blazingsql.BlazingContext.create_table

BlazingContext.create_table(table_name, input, **kwargs)

Create a BlazingSQL table.
Parameters:

- table_name: string, the name of the table.
- input: data source for the table. A cudf.DataFrame, dask_cudf.DataFrame, pandas.DataFrame, or a file path for csv, orc, parquet, etc.
- file_format (optional): string describing the file format (e.g. "csv", "orc", "parquet"). This field must only be set if the files do not have an extension; see the sketch after this list.
- local_files (optional): boolean; must be set to True if workers only have access to a subset of the files belonging to the same table. In that case, each worker loads its corresponding partitions.
- get_metadata (optional): boolean; whether to use parquet and orc metadata. Defaults to True. When set to False, the metadata-gathering step is skipped.
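A minimal sketch of the optional keyword arguments, assuming a BlazingContext instance bc (as created in the examples below) and a hypothetical path 'data/yellow_taxi_00' pointing at a parquet file saved without an extension:

>>> # file_format is required because the hypothetical file has no extension;
>>> # get_metadata=False skips reading the parquet metadata.
>>> bc.create_table('taxi_noext', 'data/yellow_taxi_00',
...                 file_format='parquet', get_metadata=False)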
Create a table from a cudf.DataFrame:

>>> import cudf
>>> df = cudf.DataFrame()
>>> df['a'] = [6, 9, 1, 6, 2]
>>> df['b'] = [7, 2, 7, 1, 2]
>>> from blazingsql import BlazingContext
>>> bc = BlazingContext()
BlazingContext ready
>>> bc.create_table('sample_df', df)
<pyblazing.apiv2.context.BlazingTable at 0x7f22f58371d0>
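A pandas.DataFrame (also a supported input) follows the same pattern; a minimal sketch, where the table name 'sample_pandas_df' is arbitrary:

>>> import pandas as pd
>>> pdf = pd.DataFrame({'a': [6, 9, 1, 6, 2], 'b': [7, 2, 7, 1, 2]})
>>> bc.create_table('sample_pandas_df', pdf)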
Create a table from a local file in the 'data' directory:

>>> bc.create_table('taxi', 'data/nyc_taxi.csv', header=0)
<pyblazing.apiv2.context.BlazingTable at 0x7f73893c0310>
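A dask_cudf.DataFrame requires a BlazingContext backed by a Dask client. A minimal sketch, assuming dask_cuda is installed and a local GPU cluster can be started; the network_interface value, the npartitions count, and the table name are assumptions for a single-machine setup:

>>> from dask.distributed import Client
>>> from dask_cuda import LocalCUDACluster
>>> cluster = LocalCUDACluster()                  # one Dask worker per local GPU
>>> client = Client(cluster)
>>> from blazingsql import BlazingContext
>>> bc_dist = BlazingContext(dask_client=client, network_interface='lo')
>>> import dask_cudf
>>> ddf = dask_cudf.from_cudf(df, npartitions=2)  # df from the cudf example above
>>> bc_dist.create_table('sample_dist_df', ddf)   # each worker holds a partition of the table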
Register and create a table from a public AWS S3 bucket:

>>> bc.s3('blazingsql-colab', bucket_name='blazingsql-colab')
>>> bc.create_table('taxi',
>>>     's3://blazingsql-colab/yellow_taxi/1_0_0.parquet')
<pyblazing.apiv2.context.BlazingTable at 0x7f09264c0310>
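Once a table is registered, it can be queried with BlazingContext.sql; a minimal usage sketch (the result comes back as a GPU DataFrame):

>>> bc.sql('SELECT COUNT(*) FROM taxi')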