Commit e6693581 authored by Matthias Weidenthaler's avatar Matthias Weidenthaler

Removed upload functionality from readme, added L2 task commit, metadata based file query, L2 task processing state query, star catalog query functionalities to readme
parent 690a4efc
This repository provides the following functionalities:
[1. Read or Download a File From S3 Storage](#1-read-or-download-a-file-from-s3-storage)
[2. Commit For File Processing](#2-commit-for-file-processing)
[3. Query a List Of L1/L2 Fits-Files By Metadata Values](#3-query-a-list-of-l1l2-fits-files-by-metadata-values)
[4. Query a L2 Processing Tasks State](#4-query-a-l2-processing-tasks-state)
[5. Query a Star Catalog](#5-query-a-star-catalog)
# 1. Read or Download a File from S3 storage
Two distinct ways of reading from S3 storage are supported:
1) [Download to a local file](#下载)
2) [Use open() to get a file object](#open-for-read)
## Configuration
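**astropy must be upgraded to 5.3**
**The old usage works with both local NAS and cloud S3; any read path that starts with the s3:// protocol is recognized automatically.**
To read from S3, the S3 key, secret, endpoint and related configuration must be supplied. There are two ways to do this:
### Method 1: Environment variables
Set the following three environment variables. Every method described below in this document tries to read them to obtain its configuration.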
......@@ -14,30 +27,16 @@ s3_options = {
```
### Method 2: Pass s3_options on every call
```
Specify s3_options as the first keyword argument. Example s3_options:
```json
```python
s3_options = {
"key": "minioadmin",
"secret": "minioadmin",
"endpoint_url": "http://localhost:9000"
}
```
## Uploading and downloading between local storage and S3
### Upload
```python
from csst_fs import s3_fs
# single file, s3_options from env
s3_fs.put('requirements.txt', 's3://csst-prod/gaia/test/requirements.txt')
# single file, s3_options from function parameter
s3_fs.put('requirements.txt', 's3://csst-prod/gaia/test/requirements.txt', s3_options=s3_options)
# folder, uploaded to s3://csst-prod/common
s3_fs.put('./common', 's3://csst-prod/', recursive=True)
s3_fs.put('./common', 's3://csst-prod/', s3_options=s3_options, recursive=True)
```
## Download from S3 to local storage
### 下载
```python
from csst_fs import s3_fs
......@@ -52,196 +51,209 @@ s3_fs.info('s3://csst-prod/gaia/data')
s3_fs.info('s3://csst-prod/gaia/test/requirements.txt', s3_options=s3_options)
```
### Open for read/write
### Open for read
```python
from csst_fs import s3_fs
# open single file (s3 or local)
with s3_fs.open('s3://csst-prod/gaia/data') as file:
file.read()
with s3_fs.open('s3://csst-prod/gaia/test/requirements.txt', s3_options=s3_options, mode='w') as file:
file.write("CSST")
```
### Check if the given file path exists
```python
from csst_fs import fs
# local or on s3, depending on the given path
fs.isfile('requirements.txt')
fs.isfile('s3://csst-prod/test.txt')
fs.isfile('s3://csst-prod/test.txt', s3_options=s3_options)
```
### Delete a file from local or s3
```python
from csst_fs import fs
# local or on s3, depending on the given path
fs.delete('requirements.txt') # uses os.remove
fs.delete('test', dir_fd=1)
fs.delete('s3://csst-prod/test.txt') # uses fsspec.delete
fs.delete('s3://csst-prod/test.txt', recursive=True, maxdepth=3)
fs.delete('s3://csst-prod/test.txt', s3_options=s3_options)
```
# 2. Commit For File Processing
## Adapting astropy code to read and write S3 directly
### fits.open
#### Old usage
```python
fits.open(path)
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/files.html#open](https://docs.astropy.org/en/stable/io/fits/api/files.html#open)
#### New usage
```python
from csst_fs import fsspec_fits
fsspec_fits.open("s3://csst-prod/gaia/xx.fits")
fsspec_fits.open("s3://csst-prod/gaia/xx.fits", s3_options=s3_options)
```
The function returns a successful response as soon as the file content is successfully stored and queued for further processing. Otherwise, the function will handle errors appropriately.
A successful response contains a task_id referring to the queued processing task. This can be used in [4. Query a L2 Processing Tasks State](#4-query-a-l2-processing-tasks-state) for querying a processing task's current state.
### fits.getheader
#### Old usage
```python
fits.getheader(filename=in_image_path, ext=1)
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/files.html#getheader](https://docs.astropy.org/en/stable/io/fits/api/files.html#getheader)
## Configuration
The helper sends HTTP requests to an external API. Set the `INGESTION_API_URL` environment variable accordingly.
## Function: `submit_file_for_ingestion`
#### New usage
```python
from csst_fs import fsspec_fits
fsspec_fits.getheader(filename=in_image_path, ext=1)
fsspec_fits.getheader(filename=in_image_path, ext=1, s3_options=s3_options)
def submit_file_for_ingestion(file_content: str, file_name: str) -> dict:
"""
Submit a file's content and file name to the ingestion API.
Args:
file_content (str): The file's content as string representation
file_name (str): The file name for storing the file after ingestion.
Returns:
dict: A dict containing a task_id, referring to the queued processing task's id.
E.g.
{
"task_id": "5",
}
"""
```
### fits.getdata
#### Old usage
```python
fits.getdata(in_ref_flat)
fits.getdata( in_ref_shutter, ext=1)
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/files.html#getdata](https://docs.astropy.org/en/stable/io/fits/api/files.html#getdata)
# 3. Query a List Of L1/L2 Fits-Files By Metadata Values
Query for file info by metadata values.
#### New usage
```python
from csst_fs import fsspec_fits
fsspec_fits.getdata(in_ref_flat)
fsspec_fits.getdata(in_ref_flat, s3_options=s3_options)
fsspec_fits.getdata( in_ref_shutter, ext=1)
fsspec_fits.getdata( in_ref_shutter, s3_options=s3_options, ext=1)
```
### fits.getval
#### Old usage
```python
fits.getval(filename, keyword)
fits.getval(filename, keyword, ext=1)
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/files.html#getval](https://docs.astropy.org/en/stable/io/fits/api/files.html#getval)
## Configuration
The helper sends HTTP requests to an external API. Set the `SEARCH_API_URL` environment variable accordingly.
#### New usage
## Function: `search_with_basic_filters`
```python
from csst_fs import fsspec_fits
fsspec_fits.getval(filename, keyword)
fsspec_fits.getval(filename, keyword, s3_options=s3_options)
fsspec_fits.getval(filename, keyword, ext=1)
fsspec_fits.getval(filename, keyword, s3_options=s3_options, ext=1)
def search_with_basic_filters(
filter: Dict[str, Any],
key: List[str],
) -> List[Dict[str, Any]]:
"""
Query for file info by metadata values.
Args:
filter: The filter dict described below.
key: A list of string values, corresponding to metadata keys that should be included in the output.
Returns:
A List[Dict] of matching documents containing a file_path value and the keys set as 'key' parameter under 'metadata'.
E.g. with key = ["dataset", "instrument", "obs_group", "obs_id"]
then returns:
[
{
"file_path": "CSST_L0/MSC/SCI/60310/10100000000/MS/CSST_MSC_MS_SCIE_20290225043953_20290225044223_10100000000_03_L0_V01.fits",
"metadata": {
"dataset":"csst-msc-c11-1000sqdeg-wide-test-v2",
"instrument":"MSC",
"obs_group":"W1",
"obs_id":"10200000000"
},
},
]
"""
```
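The return shape documented above nests the requested keys under `metadata`. As a minimal sketch of working with that shape, the helper below flattens each document into a single row. The name `flatten_search_results` is hypothetical and not part of `csst_fs`; the sample data is copied from the docstring example.

```python
from typing import Any, Dict, List


def flatten_search_results(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge each document's 'metadata' dict with its 'file_path' into one flat row.

    Hypothetical helper, not part of csst_fs.
    """
    rows = []
    for doc in results:
        row = {"file_path": doc["file_path"]}
        # Documents without a 'metadata' entry simply yield a row with only file_path.
        row.update(doc.get("metadata", {}))
        rows.append(row)
    return rows


# Sample result copied from the docstring example above.
sample = [
    {
        "file_path": "CSST_L0/MSC/SCI/60310/10100000000/MS/CSST_MSC_MS_SCIE_20290225043953_20290225044223_10100000000_03_L0_V01.fits",
        "metadata": {
            "dataset": "csst-msc-c11-1000sqdeg-wide-test-v2",
            "instrument": "MSC",
            "obs_group": "W1",
            "obs_id": "10200000000",
        },
    },
]
rows = flatten_search_results(sample)
```

Flat rows of this kind are convenient for building a table or for grouping files by one of the requested keys.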
### header.tofile
#### Old usage
## Filter Syntax
All filters are combined with logical AND (every clause must match).
1) String equality
```python
header.tofile(out_head_path)
filter = {
"dataset": "csst-msc-c11-1000sqdeg-wide-test-v2",
"obs_type": "WIDE",
}
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/headers.html#astropy.io.fits.Header.tofile](https://docs.astropy.org/en/stable/io/fits/api/headers.html#astropy.io.fits.Header.tofile)
#### New usage
2) Numeric equality and ranges
Supported inequality operators are:
lt/gt: less/greater than
lte/gte: less/greater than or equal
```python
from csst_fs import fsspec_header
fsspec_header.tofile(header, out_head_path)
fsspec_header.tofile(header, out_head_path, s3_options=s3_options)
```
### header.fromfile
#### Old usage
```python
header.fromfile(filename)
filter = {
"dataset": "csst-msc-c11-1000sqdeg-wide-test-v2",
"ra": {
"gte": 250,
"lte": 260
},
"qc_status": 0,
}
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/headers.html#astropy.io.fits.Header.fromfile](https://docs.astropy.org/en/stable/io/fits/api/headers.html#astropy.io.fits.Header.fromfile)
#### New usage
3) Timestamp equality and ranges
```python
from csst_fs import fsspec_header
fsspec_header.fromfile(filename)
fsspec_header.fromfile(filename, s3_options=s3_options)
filter = {
"created_date": "2015-08-04T11:00:00",
"obs_date": {
"gt": "2015-06-01T10:00:00",
"lt": "2015-07-01T10:00:00",
},
}
```
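The filter semantics described above (AND across clauses, plain values for equality, `lt`/`gt`/`lte`/`gte` dicts for ranges) can be mirrored locally, for example to test a filter against known metadata before sending it. This is only a sketch of the documented semantics, not the search API's implementation. The function name `matches` is hypothetical, and it assumes timestamps are ISO-8601 strings in one fixed format, so that string comparison agrees with chronological order.

```python
from typing import Any, Callable, Dict

# Range operators named in the filter syntax above.
_OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "lt": lambda value, bound: value < bound,
    "gt": lambda value, bound: value > bound,
    "lte": lambda value, bound: value <= bound,
    "gte": lambda value, bound: value >= bound,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True if every clause of flt holds for metadata (logical AND)."""
    for field, condition in flt.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # Range clause: every listed operator must hold.
            for op, bound in condition.items():
                if op not in _OPS:
                    raise ValueError(f"unsupported operator: {op}")
                if not _OPS[op](value, bound):
                    return False
        elif value != condition:
            # Equality clause.
            return False
    return True


doc = {
    "dataset": "csst-msc-c11-1000sqdeg-wide-test-v2",
    "ra": 255.0,
    "qc_status": 0,
    "obs_date": "2015-06-15T00:00:00",
}
flt = {
    "dataset": "csst-msc-c11-1000sqdeg-wide-test-v2",
    "ra": {"gte": 250, "lte": 260},
    "qc_status": 0,
    "obs_date": {"gt": "2015-06-01T10:00:00", "lt": "2015-07-01T10:00:00"},
}
inside = matches(doc, flt)
outside = matches({**doc, "ra": 270.0}, flt)
```

Here `inside` is true because all four clauses hold, while `outside` is false because the `ra` range clause fails.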
# 4. Query a L2 Processing Tasks State
Query the processing state of a processing task given an L2 task id.
### HDUList.writeto
#### Old usage
```python
hdul_img.writeto(out_combined_img, overwrite=True)
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/hdulists.html#astropy.io.fits.HDUList.writeto](https://docs.astropy.org/en/stable/io/fits/api/hdulists.html#astropy.io.fits.HDUList.writeto)
#### New usage
```python
from csst_fs import fsspec_HDUList
fsspec_HDUList.writeto(hdul_img, out_combined_img, overwrite=True)
fsspec_HDUList.writeto(hdul_img, out_combined_img, s3_options=s3_options, overwrite=True)
```
## Configuration
The helper sends HTTP requests to an external API. Set the `QUERY_TASK_STATE_API_URL` environment variable accordingly.
### HDUList.fromfile
#### Old usage
## Function: `query_processing_task_state`
```python
hdul_img = fits.HDUList.fromfile("hdulist.fits")
```
usage reference:
[https://docs.astropy.org/en/stable/io/fits/api/hdulists.html#astropy.io.fits.HDUList.fromfile](https://docs.astropy.org/en/stable/io/fits/api/hdulists.html#astropy.io.fits.HDUList.fromfile)
#### New usage
```python
from csst_fs import fsspec_HDUList
hdul_img = fsspec_HDUList.fromfile("hdulist.fits")
hdul_img = fsspec_HDUList.fromfile("hdulist.fits", cache=False, s3_options=s3_options)
def query_processing_task_state(
task_id: str
) -> Dict[str, Any]:
"""
Query the processing state of a processing task given a L2 task id.
Args:
task_id: Task id of the L2 processing task
Returns:
Dictionary of the following format, including information about the current state of the corresponding processing task.
The following strings are valid state values: tbd
E.g.
{
"state": "submission_pending",
}
"""
```
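Because a task is queued and processed asynchronously, callers will often poll its state. The sketch below shows one way to do that with a bounded number of attempts. `wait_for_task` is a hypothetical helper, not part of `csst_fs`. It takes the query function as a parameter, which in practice would be `query_processing_task_state`. Since the list of valid states is still to be defined, only `"submission_pending"` comes from the example above; the `"finished"` state in the demo is a made-up placeholder.

```python
import time
from typing import Any, Callable, Dict, Iterable


def wait_for_task(
    task_id: str,
    query: Callable[[str], Dict[str, Any]],
    pending_states: Iterable[str] = ("submission_pending",),
    interval_s: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll query(task_id) until the state leaves pending_states or max_polls is reached.

    Returns the last response received. Hypothetical helper, not part of csst_fs.
    """
    pending = set(pending_states)
    result: Dict[str, Any] = {}
    for attempt in range(max_polls):
        result = query(task_id)
        if result.get("state") not in pending:
            break
        if attempt < max_polls - 1:
            sleep(interval_s)
    return result


# Demo with a fake query function; "finished" is a placeholder state name.
_states = iter(["submission_pending", "submission_pending", "finished"])


def _fake_query(task_id: str) -> Dict[str, Any]:
    return {"state": next(_states)}


final = wait_for_task("5", _fake_query, sleep=lambda seconds: None)
```

Passing the query function and the sleep function as parameters keeps the helper testable without network access or real delays.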
# 5. Query a Star Catalog
Query a star catalog by column values given a ra, dec and radius preselection.
### table.Table.read
#### Old usage
```python
from astropy import table
table.Table.read(out_gaia_ldac, hdu=2)
```
usage reference:
[https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.read](https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.read)
## Configuration
The helper sends HTTP requests to an external API. Set the `STAR_CATALOG_SEARCH_API_URL` environment variable accordingly.
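Sections 2 to 5 each rely on one environment variable. A small startup check can report which of them are unset before any helper is called. This is a sketch only: `missing_api_env_vars` is a hypothetical name, not part of `csst_fs`, and the URL in the example is a placeholder.

```python
import os
from typing import List, Mapping

# Environment variable names taken from the Configuration sections of this README.
REQUIRED_API_ENV_VARS = (
    "INGESTION_API_URL",
    "SEARCH_API_URL",
    "QUERY_TASK_STATE_API_URL",
    "STAR_CATALOG_SEARCH_API_URL",
)


def missing_api_env_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_API_ENV_VARS if not environ.get(name)]


# Example with an explicit mapping; the URL is a placeholder value.
missing = missing_api_env_vars({"SEARCH_API_URL": "http://localhost:8080"})
```

Calling `missing_api_env_vars()` without arguments inspects the real process environment.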
#### New usage
## Function: `query_star_catalog`
```python
from csst_fs import fsspec_table
fsspec_table.read(out_gaia_ldac, hdu=2)
fsspec_table.read(out_gaia_ldac, s3_options=s3_options, hdu=2)
```
### table.Table.write
#### Old usage
```python
ps.write(ref, format='fits', overwrite=True)
def query_star_catalog(
catalog_name: str,
filter: Dict[str, Any],
key: List[str],
) -> List[Dict[str, Any]]:
"""
Query a star catalog by column values given a ra, dec and radius preselection.
Args:
catalog_name: Name of the star catalog (e.g. msc_l1_mbi_catmix)
filter: The filter dict described below.
The following keys MUST be set:
{
"ra": 40.3,
"dec": 21.9,
"radius": 0.2,
}
Ra, dec values pinpoint a location, 'radius' defines a radius in [deg] around this point.
Only star catalog objects within this area are considered for subsequent filtering.
Setting ranges with (lt, gt, lte, gte) for ra, dec values is not supported.
key: A list of string values, corresponding to the column names that should be present in the return value.
Returns:
A List[Dict] of matching star catalog objects, containing key-value pairs for the keys set as 'key' parameter.
E.g. with key = ["x", "bulge_flux", "ab"]
then returns:
[
{
"x": 995.27,
"bulge_flux": "3.2",
"ab": 1.2,
},
]
"""
```
usage reference:
[https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.write](https://docs.astropy.org/en/stable/api/astropy.table.Table.html#astropy.table.Table.write)
#### New usage
## Filter Syntax
All filters are combined with logical AND (every clause must match).
1) String equality
```python
from csst_fs import fsspec_table
fsspec_table.write(ps, ref, format='fits', overwrite=True)
fsspec_table.write(ps, ref, format='fits', s3_options=s3_options, overwrite=True)
filter = {
"ra": 40.3,
"dec": 21.9,
"radius": 0.2,
"msc_photid": "00101000703350610200001812",
"detector": "06",
}
```
2) Numeric equality and ranges
Supported inequality operators are:
lt/gt: less/greater than
lte/gte: less/greater than or equal
```python
filter = {
"ra": 40.3,
"dec": 21.9,
"radius": 0.2,
"msc_photid": "00101000703350610200001812",
"x": {
"gte": 996,
"lte": 1000,
},
"ratio_disk": -9999,
}
```
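The mandatory `ra`, `dec` and `radius` keys describe a circular preselection on the sky. The sketch below illustrates that geometry with the haversine formula for angular separation. It is an assumption about what "within the radius" means, not a description of how the catalog service computes it, and the names `angular_separation_deg` and `in_cone` are hypothetical.

```python
import math
from typing import Any, Dict


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two (ra, dec) points given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the valid asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def in_cone(obj_ra: float, obj_dec: float, flt: Dict[str, Any]) -> bool:
    """True if the object lies within flt['radius'] degrees of (flt['ra'], flt['dec'])."""
    return angular_separation_deg(flt["ra"], flt["dec"], obj_ra, obj_dec) <= flt["radius"]


flt = {"ra": 40.3, "dec": 21.9, "radius": 0.2}
near = in_cone(40.3, 22.0, flt)   # 0.1 deg away in declination
far = in_cone(40.3, 22.2, flt)    # 0.3 deg away in declination
```

Note that a difference in right ascension corresponds to a smaller angle on the sky away from the equator, which is why the separation is computed on the sphere rather than by comparing `ra` and `dec` differences directly.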
\ No newline at end of file