Version: v3

Python SDK

Installation

pip install canner-python-client

Constructor

import canner

# Create a client bound to one workspace.
client = canner.client.bootstrap(
    endpoint='https://web.default.myname.apps.cannerdata.com/web',
    workspace_id='444e8753-a4c0-4875-bdc0-834c79061d56',
    token='Y2xpZW50XzA0OTgzODM4LWNhZjktNGNmZi1hNDA4LWFkZDY3ZDc5MjIxNjo2N2YyNGY5OWEzYjFiZTEyZTg2MDI2MmMzNGQzZDRiYQ=='
)

Name | Type | Description
endpoint | string | The Canner Enterprise URL up to and including /web, e.g. https://web.default.myname.apps.cannerdata.com/web.
token | string | Personal access token. See Get Personal Access Token.
workspace_id | string | The unique ID of the workspace, shown in the workspace URL. For example, in https://web.default.myname.apps.cannerdata.com/web/workspaces/444e8753-a4c0-4875-bdc0-834c79061d56 the ID is 444e8753-a4c0-4875-bdc0-834c79061d56.
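
Because the token is a credential, it may be preferable to keep it out of source code. The following is a minimal sketch that collects the three constructor arguments from environment variables. The variable names (CANNER_ENDPOINT, CANNER_WORKSPACE_ID, CANNER_TOKEN) and the helper load_bootstrap_kwargs are hypothetical and are not defined by the SDK.

```python
import os

# Hypothetical environment variable names; the SDK does not define them.
ENV_KEYS = {
    "endpoint": "CANNER_ENDPOINT",
    "workspace_id": "CANNER_WORKSPACE_ID",
    "token": "CANNER_TOKEN",
}


def load_bootstrap_kwargs(environ=None):
    """Collect the endpoint, workspace_id and token arguments from the environment."""
    env = os.environ if environ is None else environ
    # Report every missing or empty variable at once instead of failing on the first.
    missing = [var for var in ENV_KEYS.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[var] for name, var in ENV_KEYS.items()}
```

With those variables set, the client could then be created with canner.client.bootstrap(**load_bootstrap_kwargs()).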

Saved SQL Operations

List Saved Query

  • list_saved_query(): Array<string>

    Lists the titles of all saved SQL statements in the workspace.

    queries = client.list_saved_query()
    print(queries)
    # ['select_users1', 'select_users2']

Use Saved Query

  • use_saved_query(title: string, data_format: data_format, cache_refresh: boolean, cache_ttl: number): Query

    Returns a Query object, which provides methods for working with the result data. For example, once execution has finished, you can retrieve all of the data with query.get_all().

    • title: The title of the saved SQL query to execute.

    • data_format: The format of the data returned by the query. Three formats are currently supported: list, df, and np. The default is list.

    • cache_refresh: True or False; the default is False. Whether to refresh the existing cached data. To read from the cache, leave this set to False.

    • cache_ttl: The maximum age, in seconds, of cached data that may be returned. The default is 86400 (cached data up to one day old is accepted). Only takes effect when cache_refresh=False.

      queries = client.list_saved_query()
      # ['select_users1', 'select_users2']
      query = client.use_saved_query(queries[0], data_format='list')
      query.wait_for_finish()
      data = query.get_all()
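
The cache parameters can be combined with the title lookup in a small wrapper. This is only a sketch: run_saved_query and its parameters max_cache_age and wait_timeout are hypothetical names, and it assumes that use_saved_query accepts cache_refresh and cache_ttl as keyword arguments, as the signature above suggests.

```python
def run_saved_query(client, title, max_cache_age=86400, wait_timeout=5):
    """Run a saved query by title, accepting cached data up to max_cache_age seconds old."""
    # Fail early with a clear message if the title is not saved in this workspace.
    if title not in client.list_saved_query():
        raise KeyError(f"no saved query titled {title!r}")
    query = client.use_saved_query(
        title,
        data_format="list",
        cache_refresh=False,      # read from the cache when possible
        cache_ttl=max_cache_age,  # only honoured because cache_refresh is False
    )
    # Execution is asynchronous, so wait before reading the result.
    query.wait_for_finish(timeout=wait_timeout)
    return query.get_all()
```

For example, run_saved_query(client, 'select_users1', max_cache_age=3600) would accept cached data that is at most one hour old.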

Query Operations

Generate Query

  • gen_query(sql: string, data_format: data_format): Query

    Returns a Query object, which provides methods for working with the result data. For example, once execution has finished, you can retrieve all of the data with query.get_all().

    • sql: The SQL statement to execute.

    • data_format: The format of the data returned by the query. Three formats are currently supported: list, df, and np. The default is list.

      query = client.gen_query('select * from canner.myworkspace.users', data_format='list')
      query.wait_for_finish()
      data = query.get_all()

Query

Query is the object returned by client.gen_query and client.use_saved_query.

Example: use_saved_query

queries = client.list_saved_query()
# ['select_users1', 'select_users2']
query = client.use_saved_query(queries[0])
query.wait_for_finish()
data = query.get_all()

Example: gen_query

query = client.gen_query('select * from canner.myworkspace.users')
query.wait_for_finish()
data = query.get_all()

Properties

  • columns

    Column information for the query result.

    query.columns
    # [{'name': 'status',
    # 'type': 'varchar',
    # 'typeSignature': {'rawType': 'varchar',
    # 'typeArguments': [],
    # 'literalArguments': [],
    # 'arguments': [{'kind': 'LONG_LITERAL', 'value': 2147483647}]}
    # }]
  • row_count

    Total number of rows in the result.

    query.row_count
    # 100
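
Each entry in query.columns is a dictionary, so the list can be reduced to a simple name-to-type mapping. The sketch below uses the sample value shown above as its input; column_types is a hypothetical helper, not part of the SDK.

```python
def column_types(columns):
    """Map each column name to its type string."""
    return {col["name"]: col["type"] for col in columns}


# Sample value copied from the query.columns example above.
sample_columns = [{
    "name": "status",
    "type": "varchar",
    "typeSignature": {
        "rawType": "varchar",
        "typeArguments": [],
        "literalArguments": [],
        "arguments": [{"kind": "LONG_LITERAL", "value": 2147483647}],
    },
}]

print(column_types(sample_columns))
# {'status': 'varchar'}
```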

Get Data

  • wait_for_finish(timeout=5, period=3)

    SQL execution is asynchronous. Before reading data or any result-related information such as columns or row_count, you must first make sure that the SQL execution has completed.

    • timeout: The maximum time to wait, in seconds.
    • period: The interval, in seconds, between checks of whether the query has completed.
  • get_all()

    Returns all of the data. When the data format is list or df, the columns are included.

  • get_first(limit: number)

    Returns the first limit rows; the default is one. When the data format is list or df, the columns are included.

  • get_last(limit: number)

    Returns the last limit rows; the default is one. When the data format is list or df, the columns are included.

  • get(limit: number, offset: number)

    Returns part of the data according to the given limit and offset. When the data format is list or df, the columns are included.
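
For large results, get(limit, offset) together with row_count allows the data to be read in pages rather than all at once. The following is a minimal sketch under two assumptions that the text above does not confirm: offset is zero-based, and row_count is the total number of rows in the result. iter_pages and page_size are hypothetical names.

```python
def iter_pages(query, page_size=1000, timeout=5, period=3):
    """Yield the query result page by page, using get(limit, offset)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # row_count and get() are only meaningful once execution has completed.
    query.wait_for_finish(timeout=timeout, period=period)
    # Assumes a zero-based offset and that row_count is the total row count.
    for offset in range(0, query.row_count, page_size):
        yield query.get(page_size, offset)
```

Each yielded page is whatever get returns for the query's current data format, so the caller decides how to combine pages.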

Data Format

Three data formats are currently available:

  • list: Python list (the default)
  • df: pandas DataFrame
  • np: numpy.ndarray

You can specify the format in client.gen_query or client.use_saved_query:

query = client.use_saved_query('query_name', data_format="list")
print(type(query.get_first()))
# <class 'list'>
query = client.use_saved_query('query_name', data_format="df")
print(type(query.get_first()))
# <class 'pandas.core.frame.DataFrame'>
query = client.use_saved_query('query_name', data_format="np")
print(type(query.get_first()))
# <class 'numpy.ndarray'>

You can also set query.data_format at any time to switch to a different data format.

query = client.use_saved_query('query_name')

query.data_format = 'list'
print(type(query.get_first()))
# <class 'list'>
query.data_format = 'df'
print(type(query.get_first()))
# <class 'pandas.core.frame.DataFrame'>
query.data_format = 'np'
print(type(query.get_first()))
# <class 'numpy.ndarray'>