A flag for SimpleNodeDB. Default is true. When data is retrieved from the database as an array of objects, BIGINT values are automatically converted to integers, which are easier to work with in JavaScript. If you want actual bigint values, set this option to false.
A connection to a DuckDB database.
A DuckDB database.
A flag indicating whether debugging information should be logged. Defaults to false.
The number of rows to log when debugging. Defaults to 10.
For internal use only. If you want to run a SQL query, use the customQuery method.
Optional
bigOptional
spatial?: booleanA flag to install the the spatial extension. Default is false. If true, extension will be loaded, which allows geospatial analysis.
A worker to make DuckDB work.
Assigns bins for specified column values within a table, based on an interval size.
// Assigns a bin for each row in new column bins based on column1 values, with an interval of 10.
await sdb.bins("tableA", "column1", 10, "bins")
// If the minimum value in column1 is 5, the bins will follow this pattern: "[5-14]", "[15-24]", "[25-34]", etc.
// Same thing, but with the bins starting at a specific value.
await sdb.bins("tableA", "column1", 10, "bins", {startValue: 0})
// The bins will follow this pattern: "[0-9]", "[10-19]", "[20-29]", etc.
The name of the table for which bins will be computed.
The column containing values from which bins will be computed.
The interval size for binning the values.
The name of the new column where the bins will be stored.
An optional object with configuration options:
Optional
startThe starting value for binning. Defaults to the minimum value in the specified column.
Calculates correlations between columns in a table.
If no x and y columns are specified, the method computes the correlations of all numeric columns combinations. It's important to note that correlation is symmetrical: the correlation of x over y is the same as y over x.
// Computes all correlations between all numeric columns in tableA and overwrite tableA with the results.
await sdb.correlations("tableA")
// Same but results are stored in tableB.
await sdb.correlations("tableA", {outputTable: "tableB"})
// Computes all correlations between a specific x column and all other numeric columns.
await sdb.correlations("tableA", {x: "column1"})
// Computes the correlations between a specific x and y columns.
await sdb.correlations("tableA", {x: "column1", y: "column2"})
The name of the table.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories. Correlation calculations will be run for each category.
Optional
decimals?: numberThe number of decimal places to round the correlation values.
Optional
outputAn option to store the results in a new table.
Optional
x?: stringThe column name for the x values. Default is all numeric columns.
Optional
y?: stringThe column name for the y values. Default is all numeric columns.
Performs linear regression analysis and creates a table with regression results. The results include the slope, the y-intercept the R-squared.
If no x and y columns are specified, the method computes the linear regression analysis of all numeric columns permutations. It's important to note that linear regression analysis is asymmetrical: the linear regression of x over y is not the same as y over x.
// Computes all linear regressions between all numeric columns in tableA and overwrites tableA.
await sdb.linearRegressions("tableA")
// Same but stores the results in tableB.
await sdb.linearRegressions("tableA", {outputTable: "tableB"})
// Computes all linear regressions between a specific x column and all other numeric columns.
await sdb.linearRegressions("tableA", {x: "column1"})
// Computes the linear regression between a specific x and y columns.
await sdb.linearRegressions("tableA", {x: "column1", y: "column2"})
The name of the table.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories. Correlation calculations will be run for each category.
Optional
decimals?: numberThe number of decimal places to round the regression coefficients.
Optional
outputOptional
x?: stringThe column name for the independent variable (x values) in the linear regression analysis.
Optional
y?: stringThe column name for the dependent variable (y values) in the linear regression analysis.
Normalizes the values in a column using min-max normalization.
// Normalizes the values in the column1 from tableA.
await sdb.normalize("tableA", "column1")
The name of the table.
The name of the column in which values will be normalized.
The name of the new column where normalized values will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for the normalization.
Optional
decimals?: numberThe number of decimal places to round the normalized values.
Identifies outliers using the Interquartile Range (IQR) method.
// Looks for outliers in column age from table1. Creates a new column outliers with TRUE or FALSE values.
await sdb.outliersIQR("table1", "age", "outliers")
The name of the table containing the column for outlier detection.
The name of the column in which outliers will be identified.
The name of the new column where the bins will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for outliers.
Computes proportions within a row for specified columns in a given table.
For example, let's say this is tableA.
Year | Men | Women | NonBinary |
---|---|---|---|
2021 | 564 | 685 | 145 |
2022 | 354 | 278 | 56 |
2023 | 856 | 321 | 221 |
We compute the proportions of men, women, and non-binary on each row.
await sdb.proportionsHorizontal("tableA", ["Men", "Women", "NonBinary"])
The table now looks like this.
Year | Men | Women | NonBinary | MenPerc | WomenPerc | NonBinaryPerc |
---|---|---|---|---|---|---|
2021 | 564 | 685 | 145 | 0.4 | 0.49 | 0.10 |
2022 | 354 | 278 | 56 | 0.51 | 0.4 | 0.08 |
2023 | 856 | 321 | 221 | 0.61 | 0.23 | 0.16 |
By default, the new columns have the suffix "Perc", but you use something else if you want.
await sdb.proportionsHorizontal("tableA", ["Men", "Women", "NonBinary"], {suffix: "Prop"})
Here's the result with a different suffix.
Year | Men | Women | NonBinary | MenProp | WomenProp | NonBinaryProp |
---|---|---|---|---|---|---|
2021 | 564 | 685 | 145 | 0.4 | 0.49 | 0.10 |
2022 | 354 | 278 | 56 | 0.51 | 0.4 | 0.08 |
2023 | 856 | 321 | 221 | 0.61 | 0.23 | 0.16 |
The name of the table.
The columns for which proportions will be computed on each row.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the computed proportions.
Optional
suffix?: stringA string suffix to append to the names of the new columns storing the computed proportions. Defaults to "Perc".
Computes proportions over a column's values within a table.
// This will add a column perc with the result of each column1 value divided by the sum of all column1 values.
await sdb.proportionsVertical("tableA", "column1", "perc")
The name of the table.
The column containing values for which proportions will be computed. The proportions are calculated based on the sum of values in the specified column.
The name of the new column where the proportions will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for computing proportions. This can be a single column name or an array of column names.
Optional
decimals?: numberThe number of decimal places to round the computed proportions.
Assigns quantiles for specified column values within a table.
// Assigns a quantile from 1 to 10 for each row in new column quantiles, based on values from column1.
await sdb.quantiles("tableA", "column1", 10, "quantiles")
// Same thing, except the values in column2 are used as categories.
await sdb.quantiles("tableA", "column1", 10, "quantiles", {categories: "column2"})
The name of the table.
The column containing values from which quantiles will be assigned.
The number of quantiles.
The name of the new column where the assigned quantiles will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for computing quantiles. This can be a single column name or an array of column names.
Assigns ranks in a new column based on specified column values within a table.
// Computes ranks in the new column rank from the column1 values.
await sdb.ranks("tableA", "column1", "rank")
// Computing ranks in the new column rank from the column1 values. Using the values from column2 as categories.
await sdb.ranks("tableA", "column1", "rank", {categories: "column2"})
The name of the table.
The column containing values to be used for ranking.
The name of the new column where the ranks will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for ranking.
Optional
noA boolean indicating whether to assign ranks without gaps. Defaults to false.
Computes rolling aggregations, like a rolling average. For rows without enough preceding or following rows, returns NULL. For this method to work properly, don't forget to sort your data first.
// 7-day rolling average of values in column1 with the 3 preceding and 3 following rows.
await sdb.rolling("tableA", "column1", "rollingAvg", "mean", 3, 3)
The name of the table
The name of the column storing the values to be aggregated
The name of the new column in which the computed values will be stored
How to aggregate the values
How many preceding rows to include
How many following rows to include
An optional object with configuration options:
Optional
categories?: string | string[]The category or categories to be used for the aggragation.
Optional
decimals?: numberThe number of decimal places to round the summarized values.
Creates a summary table based on specified values, categories, and summary operations.
// Summarizes all numeric columns with all available summary operations. Table tableA will be overwritten with the results.
await sdb.summarize("tableA")
// Same, but the results will be stored in tableB.
await sdb.summarize("tableA", {outputTable: "tableB"})
// Summarizes a specific column with all available summary operations. Values can be an array of column names, too.
await sdb.summarize("tableA", {values: "column1"})
// Summarizes a specific column with all available summary operations and use the values in another column as categories. Categories can be an array of column names, too.
await sdb.summarize("tableA", {values: "column1", categories: "column2"})
// Summarizes a specific column with a specific summary operation and use the values in another column as categories. Summaries can be an array of summary operations, too.
await sdb.summarize("tableA", {values: "column1", categories: "column2", summaries: "mean"})
// Summarizes and round values with a specific number of decimal places.
await sdb.summarize("tableA", {values: "column1", categories: "column2", summaries: "mean", decimals: 4})
The name of the table to be summarized.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for the summarization. This can be a single column name or an array of column names.
Optional
decimals?: numberThe number of decimal places to round the summarized values.
Optional
outputAn option to store the results in a new table.
Optional
summaries?: "min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var" | ("min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var")[]The summary operations to be performed. This can be a single summary operation or an array of summary operations.
Optional
values?: string | string[]The column or columns whose values will be summarized. This can be a single column name or an array of column names.
Computes the Z-score.
// Calculates the Z-score for the values in column age and puts the results in column sigma.
await sdb.zScore("table1", "age", "sigma")
The name of the table.
The name of the column for which Z-Score will be calculated.
The name of the new column where the bins will be stored.
An optional object with configuration options:
Optional
categories?: string | string[]The column or columns that define categories for zScores.
Optional
decimals?: numberThe number of decimal places to round the Z-score values.
Returns the bottom n rows from a table. The last row will be returned first. To keep the original order of the data, use the originalOrder option.
// Last row will be returned first.
const bottom10 = await sdb.getBottom("inventory", 10)
// Last row will be returned last.
const bottom10 = await sdb.getBottom("inventory", 10, {originalOrder: true})
// With a condition
const bottom10Books = await sdb.getBottom("inventory", 10, {condition: `category = 'Books'` })
The name of the table.
The number of rows to return.
An optional object with configuration options:
Optional
condition?: stringThe filtering conditions specified as a SQL WHERE clause. Defaults to no condition.
Optional
originalA boolean indicating whether the rows should be returned in their original order. Default is false, meaning the last row will be returned first.
Returns the data from a specified table with an optional condition.
// No condition. Returns all data.
const data = await sdb.getData("inventory")
// With condition
const books = await sdb.getData("inventory", {condition: "category = 'Book'"})
The name of the table from which to retrieve the data.
An optional object with configuration options:
Optional
condition?: stringA SQL WHERE clause condition to filter the data. Defaults to no condition.
Returns the first row from a table based on optional filtering conditions.
// No condition.
const firstRow = await sdb.getFirstRow("inventory")
// With condition
const firstRowBooks = await sdb.getFirstRow("inventory", {condition: "category = 'Book'"})
The name of the table.
An optional object with configuration options:
Optional
condition?: stringThe filtering conditions specified as a SQL WHERE clause. Defaults to no condition.
Returns the last row from a table based on optional filtering conditions.
// No condition.
const lastRow = await sdb.getLastRow("inventory")
// With condition
const lastRowBooks = await sdb.getLastRow("inventory", {condition: "category = 'Book'"})
The name of the table.
An optional object with configuration options:
Optional
condition?: stringThe filtering conditions specified as a SQL WHERE clause. Defaults to no condition.
Returns the maximum value from a specific column in a table.
const maximum = sdb.getMax("tableA", "column1")
The name of the table.
The name of the column.
Returns the mean value from a specific column in a table.
const mean = sdb.getMean("tableA", "column1")
The name of the table.
The name of the column.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Returns the median value from a specific column in a table.
const median = sdb.getMedian("tableA", "column1")
The name of the table.
The name of the column.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Returns the minimum value from a specific column in a table.
const minimum = sdb.getMin("tableA", "column1")
The name of the table.
The name of the column.
Returns the value of a specific quantile from the values in a given column of a table.
const firstQuartile = sdb.getQuantile("tableA", "column1", 0.25)
The name of the table.
The name of the column from which to calculate the quantile.
The quantile (between 0 and 1) to calculate. For example, 0.25 for the first quartile.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Returns the skewness of values from a specific column in a table.
const skew = sdb.getSkew("tableA", "column1")
The name of the table.
The name of the column.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Returns the standard deviation of values from a specific column in a table.
const standardDeviation = sdb.getStdDev("tableA", "column1")
The name of the table.
The name of the column.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Returns the top n rows from a table.
const top10 = await sdb.getTop("inventory", 10)
// With a condition
const top10Books = await sdb.getTop("inventory", 10, {condition: `category = 'Books'` })
The name of the table.
The number of rows to return.
An optional object with configuration options:
Optional
condition?: stringThe filtering conditions specified as a SQL WHERE clause. Defaults to no condition.
Returns unique values from a specific column in a table. For convenience, it returns the value ascendingly.
const uniques = await sdb.getUniques("tableA", "column1")
The name of the table.
The name of the column from which to retrieve unique values.
Returns the values of a specific column in a table.
const values = await sdb.getValues("tableA", "column1")
The name of the table.
The name of the column.
Returns the variance of values from a specific column in a table.
const variance = sdb.getVar("tableA", "column1")
The name of the table.
The name of the column.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round the result to.
Inserts rows formatted as an array of objects into an existing table.
const rows = [ { letter: "a", number: 1 }, { letter: "b", number: 2 }]
await sdb.insertRows("tableA", rows)
The name of the table to insert rows into.
An array of objects representing the rows to be inserted into the table.
Inserts all rows from one table (or multiple tables) into another existing table.
// Insert all rows from tableB into tableA.
await sdb.insertTable("tableA", "tableB")
// Insert all rows from tableB and tableC into tableA.
await sdb.insertTable("tableA", ["tableB", "tableC"])
The name of the table to insert rows into.
The name of the table(s) from which rows will be inserted.
Creates or replaces a table and loads an array of objects into it.
const data = [{letter: "a", number: 1}, {letter: "b", number: 2}]
await simpleDB.loadArray("tableA", data)
The name of the table to be created.
An array of objects representing the data.
Creates or replaces a table and loads data from an external file into it.
await sdb.loadData("tableA", "https://some-website.com/some-data.csv")
The name of the new table.
The URL of the external file containing the data. CSV, JSON, and PARQUET files are accepted.
An optional object with configuration options:
Optional
autoA boolean indicating whether to automatically detect the data format. Defaults to true.
Optional
delim?: stringThe delimiter used in the file. Applicable for DSV files. Defaults to ",".
Optional
fileThe type of the external file. Defaults to the file extension.
Optional
header?: booleanA boolean indicating whether the file contains a header row. Applicable for CSV files. Defaults to true.
Optional
skip?: numberThe number of rows to skip at the beginning of the file. Defaults to 0.
Adds a new column to a table based on a type (JavaScript or SQL types) and a SQL definition.
// Adds column3 to tableA. The column's values are floats (equivalent to DOUBLE in SQL) and are the results of the sum of values from column1 and column2.
await sdb.addColumn("tableA", "column3", "float", "column1 + column2")
The name of the table to which the new column will be added.
The name of the new column to be added.
The data type for the new column. JavaScript or SQL types.
SQL expression defining how the values should be computed for the new column.
Adds a new column to a table with the row number.
// Adds the row number in new column rowNumber.
await sdb.addRowNumber("tableA", "rowNumber")
The name of the table
The name of the new column storing the row number
Clones a column in a given table.
// Clones column1 as column2 from tableA
await sdb.cloneColumn("tableA", "column1", "column2")
The table in which the cloning should happen.
The original column.
The name of the cloned column.
Converts data types (JavaScript or SQL types) of specified columns in a table.
// Converts column1 to string and column2 to integer
await sdb.convert("tableA", {column1: "string", column2: "integer"})
// Same thing but with SQL types
await sdb.convert("tableA", {column1: "varchar", column2: "bigint"})
// Converts a string to a date
await sdb.convert("tableA", {column3: "datetime"}, {datetimeFormat: "%Y-%m-%d" })
The name of the table where data types will be converted.
An object mapping column names to the target data types for conversion.
An optional object with configuration options:
Optional
datetimeA string specifying the format for date and time conversions. The method uses strftime and strptime functions from DuckDB. For the format specifiers, see https://duckdb.org/docs/sql/functions/dateformat.
Optional
try?: booleanWhen true, the values that can't be converted will be replaced by NULL instead of throwing an error. Defaults to false.
Creates a new empty table with specified columns and data types (JavaScript or SQL). If the table already exists, it will be overwritten.
await sdb.createTable("employees", {
name: "string",
salary: "integer",
raise: "float",
})
The name of the table.
An object specifying the columns and their data types (JavaScript or SQL).
Performs a cross join operation between two tables returning all pairs of rows. With SimpleNodeDB, it might create a .tmp folder, so make sure to add .tmp to your gitignore.
// By default, the leftTable (tableA here) will be overwritten with the result.
await crossJoin("tableA", "tableB");
// But you can put the result into another table if needed.
await crossJoin("tableA", "tableB", { outputTable: "tableC" });
The name of the left table.
The name of the right table.
An optional object with configuration options:
Optional
outputThe name of the table that will be created or replaced with the result of the cross join.
Merges the data of two tables based on a common column. With SimpleNodeDB, it might create a .tmp folder, so make sure to add .tmp to your gitignore.
// By default, the method automatically looks for a common column in the two tables and does a left join of tableA (left) and tableB (right). The leftTable (tableA here) will be overwritten with the result.
await sdb.join("tableA", "tableB")
// You can change the common column, the join type, and the output table in options.
await sdb.join("tableA", "tableB", { commonColumn: 'id', type: 'inner', outputTable: 'tableC' })
The name of the left table to be joined.
The name of the right table to be joined.
An optional object with configuration options:
Optional
commonThe common column used for the join operation. By default, the method automatically searches for a column name that exists in both tables.
Optional
outputThe name of the new table that will store the result of the join operation. Default is the leftTable.
Optional
type?: "inner" | "left" | "right" | "full"The type of join operation to perform. Possible values are "inner", "left", "right", or "full". Default is "left".
Restructures a table by stacking values. Useful to tidy up data.
As an example, let's use this as tableA. Let's say it shows the number of employees per year in different departments.
Department | 2021 | 2022 | 2023 |
---|---|---|---|
Accounting | 10 | 9 | 15 |
Sales | 52 | 75 | 98 |
We restructure it by putting all years into a column Year and the employees counts into a column Employees.
await sdb.longer("tableA", ["2021", "2022", "2023"], "year", "employees")
Now, the table looks like this and is longer.
Department | Year | Employees |
---|---|---|
Accounting | 2021 | 10 |
Accounting | 2022 | 9 |
Accounting | 2023 | 15 |
Sales | 2021 | 52 |
Sales | 2022 | 75 |
Sales | 2023 | 98 |
The name of the table to be restructured.
The column names (and associated values) that we want to stack.
The new column in which the stacked columns' names will be put into.
The new column in which the stacked columns' values will be put into.
Removes one or more columns from a table.
await sdb.removeColumns("tableA", ["column1", "column2"])
The name of the table from which columns will be removed.
The name or an array of names of the columns to be removed.
Renames columns in a specified table.
// Renaming "How old?" to "age" and "Man or woman?" to "sex" in tableA.
await sdb.renameColumns("tableA", {"How old?" : "age", "Man or woman?": "sex"})
The table in which columns will be renamed.
An object mapping old column names to new column names.
Restructures a table by unstacking values.
As an example, let's use this as tableA. Let's say it shows the number of employees per year in different departments.
Department | Year | Employees |
---|---|---|
Accounting | 2021 | 10 |
Accounting | 2022 | 9 |
Accounting | 2023 | 15 |
Sales | 2021 | 52 |
Sales | 2022 | 75 |
Sales | 2023 | 98 |
We restructure it by making a new column for each year and with the associated employees counts as values.
await sdb.longer("tableA", "Year", "Employees")
Now, the table looks like this and is wider.
Department | 2021 | 2022 | 2023 |
---|---|---|---|
Accounting | 10 | 9 | 15 |
Sales | 52 | 75 | 98 |
The name of the table to be restructured.
The column containing the values that will be transformed into columns.
The column containing values to be spread across the new columns.
Filters rows from a table based on SQL conditions. Note that it's often faster to use the removeRows method.
// In table store, keep only rows where the fruit is not an apple.
await sdb.filter("store", "fruit != 'apple'")
// More examples:
await sdb.filter("store", "price > 100 AND quantity > 0")
await sdb.filter("inventory", "category = 'Electronics' OR category = 'Appliances'")
await sdb.filter("customers", "lastPurchaseDate >= '2023-01-01'")
The name of the table from which rows will be filtered.
The filtering conditions specified as a SQL WHERE clause.
Keeps rows with specific values in specific columns.
// In table employees, keep only rows where the job is accountant or developer and where the city is Montreal.
await sdb.keep("employees", { job: [ "accountant", "developer" ], city: [ "Montreal" ] })
The name of the table
An object with the columns (keys) and the values to be kept (values as arrays)
Remove rows with specific values in specific columns.
// In table employees, remove rows where the job is accountant or developer and where the city is Montreal.
await sdb.remove("employees", { job: [ "accountant", "developer" ], city: [ "Montreal" ] })
The name of the table
An object with the columns (keys) and the values to be removed (values as arrays)
Removes duplicate rows from a table, keeping unique rows. Note that SQL does not guarantee any specific order when using DISTINCT. So the data might be returned in a different order than the original.
await sdb.removeDuplicates("tableA")
The name of the table from which duplicates will be removed.
An optional object with configuration options:
Optional
on?: string | string[]A column or multiple columns to consider to remove duplicates. The other columns in the table will not be considered to exclude duplicates.
Removes rows with missing values from a table. By default, missing values are NULL (as an SQL value), but also "NULL", "null", "NaN" and "undefined" that might have been converted to strings before being loaded into the table. Empty strings "" are also considered missing values.
// Removes rows with missing values in any columns.
await sdb.removeMissing("tableA")
// Removes rows with missing values in specific columns.
await sdb.removeMissing("tableA", { columns: ["firstName", "lastName"]})
The name of the table from which rows with missing values will be removed.
An optional object with configuration options:
Optional
columns?: string | string[]Either a string or an array of strings specifying the columns to consider for missing values. By default, all columns are considered.
Optional
invert?: booleanA boolean indicating whether to invert the condition, keeping only rows with missing values. Defaults to false.
Optional
missingAn array of values to be treated as missing values. Defaults to ["undefined", "NaN", "null", "NULL", ""].
Removes rows from a table based on SQL conditions.
// In table store, remove rows where the fruit is an apple.
await sdb.removeRows("store", "fruit = 'apple'")
The name of the table from which rows will be removed.
The filtering conditions specified as a SQL WHERE clause.
Selects random rows from a table and removes the others.
// Selects 100 random rows in tableA
await sdb.sample("tableA", 100)
// Selects 10% of the rows randomly in tableB
await sdb.sample("tableB", "10%")
The name of the table from which rows will be sampled.
The number of rows (1000 for example) or a string ("10%" for example) specifying the sampling size.
An optional object with configuration options:
Optional
seed?: numberA number specifying the seed for repeatable sampling. For example, setting it to 1 will ensure random rows will be the same each time you run the method.
Selects specific columns in a table and removes the others.
// Selecting only the columns firstName and lastName from tableA. All other columns in the table will be removed.
await sdb.selectColumns("tableA", ["firstName", "lastName"])
The name of the table from which columns will be selected.
Either a string (one column) or an array of strings (multiple columns) representing the columns to be selected.
Selects n rows from a table. An offset and outputTable options are available.
// Selects the first 100 rows from tableA.
await sdb.selectRows("tableA", 100)
// Selects 100 rows from tableA, after skipping the first 100 rows.
await sdb.selectRows("tableA", 100, {offset: 100})
// Selects 100 rows from tableA and stores them in tableB.
await sdb.selectRows("tableA", 100, {outputTable: "tableB"})
The name of the table from which rows will be selected.
The number of rows.
An optional object with configuration options:
Optional
offset?: numberThe number of rows to skip before selecting. Defaults to 0.
Optional
outputThe name of the table that will be created or replaced and where the new rows will be stored. By default, the original table is overwritten.
Concatenates values from specified columns into a new column in a table.
// Concatenates values from column1 and column2 into column3
await sdb.concatenate("tableA", ["column1", "column2"], "column3")
// Same thing, but the values will be separated by a dash
await sdb.concatenate("tableA", ["column1", "column2"], "column3", {separator: "-"})
The name of the table where concatenation will occur.
An array of column names from which values will be concatenated.
The name of the new column to store the concatenated values.
An optional object with configuration options:
Optional
separator?: stringThe string used to separate concatenated values. Defaults to an empty string.
Extracts a specific number of characters, starting from the left.
// Strings in column1 of tableA will be replaced by the first two characters of each string.
await sdb.left("tableA", "column1", 2)
The name of the table
The name of the column storing the strings
The number of characters, starting from the left
Formats strings to lowercase.
// Just in one column.
await sdb.lower("tableA", "column1")
// In multiple columns
await sdb.lower("tableA", ["column1", "column2"])
The name of the table in which strings will be formatted to lowercase.
Either a string or an array of strings specifying the columns to be updated.
Replaces specified strings in the selected columns of a table.
// Replaces entire strings and substrings too.
await sdb.replace("tableA", "column1", {"kilograms": "kg", liters: "l" })
// Replaces only if matching entire string.
await sdb.replace("tableA", "column1", {"kilograms": "kg", liters: "l" }, {entireString: true})
// Replaces using a regular expression. Any sequence of one or more digits would be replaced by a hyphen.
await sdb.replace("tableA", "column1", {"\d+": "-" }, {regex: true})
The name of the table in which strings will be replaced.
Either a string or an array of strings specifying the columns where string replacements will occur.
An object mapping old strings to new strings.
An optional object with configuration options:
Optional
entireA boolean indicating whether the entire string must match for replacement. Defaults to false.
Optional
regex?: booleanA boolean indicating the use of regular expressions for a global replace. See the RE2 docs for the syntax. Defaults to false.
Replaces null values in the selected columns of a table.
// Replace null values by 0.
await sdb.replaceNulls("tableA", "column1", 0)
The name of the table in which strings will be replaced.
Either a string or an array of strings specifying the columns where string replacements will occur.
The value to replace the null values.
Extracts a specific number of characters, starting from the right.
// Strings in column1 of tableA will be replaced by the last two characters of each string.
await sdb.right("tableA", "column1", 2)
The name of the table
The name of the column storing the strings
The number of characters, starting from the right
Rounds numeric values in specified columns of a table.
// Rounds column1's values to the nearest integer.
await sdb.round("tableA", "column1")
// Rounds column1's values with a specific number of decimal places.
await sdb.round("tableA", "column1", {decimals: 2})
// Rounds column1's values with a specific method. Available methods are "round", "floor" and "ceiling".
await sdb.round("tableA", "column1", {method: "floor"})
The name of the table where numeric values will be rounded.
Either a string or an array of strings specifying the columns containing numeric values to be rounded.
An optional object with configuration options:
Optional
decimals?: numberThe number of decimal places to round to. Defaults to 0.
Optional
method?: "round" | "ceiling" | "floor"The rounding method to use. Defaults to "round".
Sorts the rows of a table based on specified column(s) and order(s).
// Sorts column1 ascendingly then column2 descendingly.
await sdb.sort("tableA", {column1: "asc", column2: "desc"})
// Same thing but taking French accent into account.
await sdb.sort("tableA", {column1: "asc", column2: "desc"}, {lang: {column1: "fr"}})
The name of the table to sort.
An object mapping column names to the sorting order: "asc" for ascending or "desc" for descending.
An optional object with configuration options:
Optional
lang?: { An object mapping column names to language codes. See DuckDB Collations documentation for more: https://duckdb.org/docs/sql/expressions/collations.
Splits strings along a separator and replaces the values with a substring at a specified index (starting at 0). If the index is outside the bounds of the list, return an empty string.
// Splits on commas and replaces values with the second substring.
await sdb.splitExtract("tableA", "column1", ",", 1)
The name of the table
The name of the column storing the strings
The substring to use as a separator
The index of the substring to replace values
Trims specified characters from the beginning, end, or both sides of string values.
// Trims values in column1
await sdb.trim("tableA", "column1")
// Trims values in column2, columns3, and column4
await sdb.trim("tableA", ["column2", "column3", "column4"])
The name of the table.
The column or columns to trim.
An optional object with configuration options:
Optional
character?: stringThe string to trim. Defaults to whitespace.
Optional
method?: "trim" | "leftTrim" | "rightTrim"The trimming method.
Updates values in a specified column in a given table.
await sdb.updateColumn("tableA", "column1", `LEFT(column2)`)
The name of the table.
The name of the column to be updated.
The SQL expression to set the new values in the column.
Updates data in a table using a JavaScript function. The function takes the existing rows as an array of objects and must return them modified as an array of objects. This method provides a flexible way to update data, but it's slow.
// Adds one to the values from column1 in tableA. If the values are not numbers, they are replaced by null.
await sdb.updateWithJS("tableA", (rows) => {
const modifiedRows = rows.map(d => ({
...d,
column1: typeof d.column1 === "number" ? d.column1 + 1 : null
}))
return modifiedRows
})
The name of the table to update.
A function that takes the existing rows and returns modified rows using JavaScript logic. The original rows are objects in an array and the modified rows must be returned as an array of objects too.
Formats strings to uppercase.
// Just in one column.
await sdb.upper("tableA", "column1")
// In multiple columns
await sdb.upper("tableA", ["column1", "column2"])
The name of the table in which strings will be formatted to uppercase.
Either a string or an array of strings specifying the columns to be updated.
Clones an existing table by creating or replacing a table with the same structure and data. The data can be optionally filtered. This can be very slow with big tables.
// tableA data is cloned into tableB.
await sdb.cloneTable("tableA", "tableB")
// tableA data is cloned into tableB. Only rows with values greater than 10 in column1 are kept.
await sdb.cloneTable("tableA", "tableB", {condition: `column1 > 10`})
The name of the table to be cloned.
The name of the new table that will be created as a clone.
An optional object with configuration options:
Optional
condition?: stringA SQL WHERE clause condition to filter the data. Defaults to no condition.
Executes a custom SQL query, providing flexibility for advanced users.
// You can use the returnDataFrom option to retrieve the data from the query, if needed.
await sdb.customQuery( "SELECT * FROM employees WHERE Job = 'Clerk'", {returnDataFrom: "query"})
The custom SQL query to be executed.
An optional object with configuration options:
Optional
returnSpecifies whether to return data from the "query" or not. Defaults to "none".
Optional
table?: stringThe name of the table associated with the query (if applicable). Needed when debug is true.
Returns descriptive information about the columns of a specified table, including details like data types, number of null and distinct values. Best to look at with console.table.
const description = await sdb.getDescription("tableA")
The name of the table.
Returns the schema (column names and their data types) of a specified table.
const schema = await sdb.getSchema("tableA")
The name of the table for which to retrieve the schema.
Returns true if a specified column exists in a given table and false if not.
const hasColumnSalary = await sdb.hasColumn("employees", "salary")
The name of the table.
The name of the column to check for existence.
Logs a specified number of rows from a table. Default is 10 rows.
// Logs first 10 rows
await sdb.logTable("tableA");
// Logs first 100 rows
await sdb.logTable("tableA", {nbRowsToLog: 100});
The name of the table.
An optional object with configuration options:
Optional
nbThe number of rows to log when debugging. Defaults to 10 or the value set in the SimpleDB instance.
SimpleDB is a class that provides a simplified interface for working with DuckDB, a high-performance in-memory analytical database. This class is meant to be used in a web browser. For NodeJS and similar runtimes, use SimpleNodeDB.
Here's how to instantiate a SimpleDB instance.
The start() method will be called internally automatically with the first method you'll run. It initializes DuckDB and establishes a connection to the database.