Python SDK
Installation
pip install canner-python-client
Constructor
- Normal Python Env (Standalone Mode)
import canner

client = canner.client.bootstrap(
    endpoint='https://web.default.myname.apps.cannerdata.com/web',
    workspace_id='444e8753-a4c0-4875-bdc0-834c79061d56',
    token='Y2xpZW50XzA0OTgzODM4LWNhZjktNGNmZi1hNDA4LWFkZDY3ZDc5MjIxNjo2N2YyNGY5OWEzYjFiZTEyZTg2MDI2MmMzNGQzZDRiYQ=='
)
Name | Type | Description |
---|---|---|
endpoint | string | The Canner Enterprise URL up to and including /web, e.g. https://web.default.myname.apps.cannerdata.com/web |
token | string | Personal access token; see Get Personal Access Token. |
workspace_id | string | The unique ID of the workspace, which is displayed in the workspace URL. For example, https://web.default.myname.apps.cannerdata.com/web/workspaces/444e8753-a4c0-4875-bdc0-834c79061d56 |
- Normal Python Env (Cluster Mode)
import canner

client = canner.client.bootstrap(
    endpoint='https://web.default.myname.apps.cannerdata.com',
    workspace_id='444e8753-a4c0-4875-bdc0-834c79061d56',
    token='Y2xpZW50XzA0OTgzODM4LWNhZjktNGNmZi1hNDA4LWFkZDY3ZDc5MjIxNjo2N2YyNGY5OWEzYjFiZTEyZTg2MDI2MmMzNGQzZDRiYQ=='
)
Name | Type | Description |
---|---|---|
endpoint | string | The Canner Enterprise URL, e.g. https://web.default.myname.apps.cannerdata.com |
token | string | Personal access token; see Get Personal Access Token. |
workspace_id | string | The unique ID of the workspace, which is displayed in the workspace URL. For example, https://web.default.myname.apps.cannerdata.com/workspaces/444e8753-a4c0-4875-bdc0-834c79061d56 |
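To keep the token out of source files, the three constructor arguments can be read from environment variables. The following is a minimal sketch, not part of the SDK: the helper `load_bootstrap_args` and the variable names `CANNER_ENDPOINT`, `CANNER_WORKSPACE_ID`, and `CANNER_TOKEN` are hypothetical and chosen only for illustration.

```python
import os


def load_bootstrap_args(env=None):
    """Collect the three bootstrap arguments from environment variables.

    The variable names below are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    names = {
        "endpoint": "CANNER_ENDPOINT",
        "workspace_id": "CANNER_WORKSPACE_ID",
        "token": "CANNER_TOKEN",
    }
    # Fail early with a clear message if any variable is unset or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in names.items()}


# With the SDK installed and the variables set:
# import canner
# client = canner.client.bootstrap(**load_bootstrap_args())
```

The returned dictionary uses the same keys as the constructor's parameters, so it can be unpacked directly into `canner.client.bootstrap`.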
Saved SQL Operations
List Saved Query
list_saved_query(): Array<string>
Lists the titles of all SQL statements saved in the workspace.
queries = client.list_saved_query()
print(queries)
# ['select_users1', 'select_users2']
Use Saved Query
use_saved_query(title: string, data_format: data_format, cache_refresh: boolean, cache_ttl: number): Query
The call returns a Query object, which provides methods for working with the data once execution has completed. For example, you can retrieve all the data with query.get_all().
- title: The title of the saved SQL query to execute.
- data_format: The format of the data returned by the query. Three formats are currently available: list, df, and np. The default is list.
- cache_refresh: True or False; the default is False. Whether to refresh the existing cached data. To read cached data, set this option to False.
- cache_ttl: A number of seconds that sets the maximum age of cached data that may be used. The default is 86400 (cached data up to one day old is accepted). Only valid when cache_refresh=False.
queries = client.list_saved_query()
# ['select_users1', 'select_users2']
query = client.use_saved_query(queries[0], data_format='list')
query.wait_for_finish()
data = query.get_all()
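The steps above can be combined into one helper that checks the title against list_saved_query() before running it and passes the cache options explicitly. This is a minimal sketch: `fetch_saved_query`, `fresh`, and `max_age_seconds` are hypothetical names, and the sketch assumes the parameters in the signature can be passed as keyword arguments.

```python
def fetch_saved_query(client, title, fresh=False, max_age_seconds=86400):
    """Run a saved query by title and return all of its data.

    `client` is the object returned by canner.client.bootstrap(...).
    """
    titles = client.list_saved_query()
    if title not in titles:
        raise ValueError(f"no saved query titled {title!r}; available: {titles}")

    query = client.use_saved_query(
        title,
        data_format="list",
        cache_refresh=fresh,        # True: refresh the cache instead of reading it
        cache_ttl=max_age_seconds,  # only relevant when cache_refresh is False
    )
    # Execution is asynchronous, so wait before reading the result.
    query.wait_for_finish()
    return query.get_all()
```

Calling `fetch_saved_query(client, 'select_users1', fresh=True)` would request fresh data, while the defaults accept cached data up to one day old.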
Query Operations
Generate Query
gen_query(sql: string, data_format: data_format): Query
The call returns a Query object, which provides methods for working with the data once execution has completed. For example, you can retrieve all the data with query.get_all().
- sql: The SQL statement to execute.
- data_format: The format of the data returned by the query. Three formats are currently available: list, df, and np. The default is list.
query = client.gen_query('select * from canner.myworkspace.users', data_format='list')
query.wait_for_finish()
data = query.get_all()
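For longer-running statements, the wait can be tuned through the timeout and period arguments of wait_for_finish. The helper below is a minimal sketch: `run_sql` is a hypothetical name, and it assumes both arguments can be passed as keyword arguments, as the documented signature `wait_for_finish(timeout=5, period=3)` suggests.

```python
def run_sql(client, sql, data_format="list", timeout=5, period=3):
    """Submit a SQL statement, wait for it to finish, and return all data.

    `client` is the object returned by canner.client.bootstrap(...).
    """
    query = client.gen_query(sql, data_format=data_format)
    # Both values are in seconds; the defaults mirror those of wait_for_finish.
    query.wait_for_finish(timeout=timeout, period=period)
    return query.get_all()
```

For example, `run_sql(client, 'select * from canner.myworkspace.users', timeout=60, period=2)` would wait up to a minute and check the status every two seconds.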
Query
Query is the object returned by client.gen_query and client.use_saved_query.
Example: use_saved_query
queries = client.list_saved_query()
# ['select_users1', 'select_users2']
query = client.use_saved_query(queries[0])
query.wait_for_finish()
data = query.get_all()
Example: gen_query
query = client.gen_query('select * from canner.myworkspace.users')
query.wait_for_finish()
data = query.get_all()
Properties
columns
Column information for the result.
query.columns
# [{'name': 'status',
#   'type': 'varchar',
#   'typeSignature': {'rawType': 'varchar',
#                     'typeArguments': [],
#                     'literalArguments': [],
#                     'arguments': [{'kind': 'LONG_LITERAL', 'value': 2147483647}]}
# }]
row_count
The total number of rows in the result.
query.row_count
# 100
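Often only the column names and types are needed rather than the full type signature. The sketch below reduces a columns-style list to a name-to-type mapping; `column_types` is a hypothetical helper, and the sample data is copied from the output shown above rather than fetched from a live workspace.

```python
def column_types(columns):
    """Map each column name to its type from a query.columns-style list."""
    return {col["name"]: col["type"] for col in columns}


# Sample shaped like the query.columns output shown above.
sample_columns = [
    {
        "name": "status",
        "type": "varchar",
        "typeSignature": {
            "rawType": "varchar",
            "typeArguments": [],
            "literalArguments": [],
            "arguments": [{"kind": "LONG_LITERAL", "value": 2147483647}],
        },
    }
]

print(column_types(sample_columns))
# {'status': 'varchar'}
```

With a finished query, the same helper would be called as `column_types(query.columns)`.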
Get Data
wait_for_finish(timeout=5, period=3)
SQL execution is asynchronous. Before obtaining data or any result-related information such as columns or row_count, you must first make sure that the SQL execution has completed.
- timeout: The maximum time to wait, in seconds.
- period: The polling interval, in seconds: how often to check whether the query has completed.
get_all()
Returns all data. When data_format is list or df, the columns are included.
get_first(limit: number)
Returns the first limit rows; the default is one. When data_format is list or df, the columns are included.
get_last(limit: number)
Returns the last limit rows; the default is one. When data_format is list or df, the columns are included.
get(limit: number, offset: number)
Returns part of the data according to the given limit and offset. When data_format is list or df, the columns are included.
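For large results, get(limit, offset) can be combined with row_count to read the data in pages instead of all at once. This is a minimal sketch: `iter_pages` is a hypothetical helper, and it assumes that offset is zero-based and that limit and offset can be passed as keyword arguments, neither of which is stated above.

```python
def iter_pages(query, page_size=100):
    """Yield the result in chunks of at most page_size rows.

    Assumes the query has already finished (wait_for_finish was called)
    and that offset is zero-based; both are assumptions of this sketch.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, query.row_count, page_size):
        yield query.get(limit=page_size, offset=offset)
```

A caller would loop with `for page in iter_pages(query, page_size=500): ...`; the type of each page follows the query's data_format.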
Data Format
There are currently three data formats available:
- list: Python list, the default value
- df: pandas DataFrame
- np: numpy.ndarray
You can specify the format in client.gen_query or client.use_saved_query:
query = client.use_saved_query('query_name', data_format="list")
print(type(query.get_first()))
# <class 'list'>
query = client.use_saved_query('query_name', data_format="df")
print(type(query.get_first()))
# <class 'pandas.core.frame.DataFrame'>
query = client.use_saved_query('query_name', data_format="np")
print(type(query.get_first()))
# <class 'numpy.ndarray'>
You can also change query.data_format at any time to switch to a different data format.
query = client.use_saved_query('query_name')
query.data_format = 'list'
print(type(query.get_first()))
# <class 'list'>
query.data_format = 'df'
print(type(query.get_first()))
# <class 'pandas.core.frame.DataFrame'>
query.data_format = 'np'
print(type(query.get_first()))
# <class 'numpy.ndarray'>
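Because data_format is a plain string, a typo such as 'DF' is easy to make. The guard below is a minimal sketch that validates the value against the three documented formats before assigning it; `set_data_format` and `VALID_FORMATS` are hypothetical names, and how the SDK itself reacts to an unknown value is not covered here.

```python
VALID_FORMATS = ("list", "df", "np")  # the three formats documented above


def set_data_format(query, data_format):
    """Assign data_format on a Query after checking it is a documented value."""
    if data_format not in VALID_FORMATS:
        raise ValueError(
            f"unknown data_format {data_format!r}; expected one of {VALID_FORMATS}"
        )
    query.data_format = data_format
    return query
```

Since the helper returns the query, it can be chained, for example `set_data_format(query, 'df').get_first()`.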