SimpleNodeDB is a class that provides a simplified interface for working with DuckDB, a high-performance, in-memory analytical database. This class is meant to be used with NodeJS and similar runtimes. For web browsers, use SimpleDB.

Here's how to instantiate a SimpleNodeDB instance.

const sdb = new SimpleNodeDB()

// Same thing but will log useful information in the terminal. The first 20 rows of tables will be logged. Also installs the spatial extension for geospatial analysis.
const sdb = new SimpleDB({ debug: true, nbRowsToLog: 20, spatial: true})

The start() method will be called internally automatically with the first method you'll run. It initializes DuckDB and establishes a connection to the database. It optionally loads the spatial extension.

With very expensive computations, it might create a .tmp folder, so make sure to add .tmp to your gitignore.

Hierarchy (view full)

Properties

bigIntToInt: undefined | boolean

A flag for SimpleNodeDB. Default is true. When data is retrieved from the database as an array of objects, BIGINT values are automatically converted to integers, which are easier to work with in JavaScript. If you want actual bigint values, set this option to false.

connection: AsyncDuckDBConnection | Connection

A connection to a DuckDB database.

db: AsyncDuckDB | Database

A DuckDB database.

debug: boolean

A flag indicating whether debugging information should be logged. Defaults to false.

nbRowsToLog: number

The number of rows to log when debugging. Defaults to 10.

runQuery: ((query, connection, returnDataFromQuery, options) => Promise<null | {
    [key: string]: number | string | Date | boolean | null;
}[]>)

For internal use only. If you want to run a SQL query, use the customQuery method.

Type declaration

    • (query, connection, returnDataFromQuery, options): Promise<null | {
          [key: string]: number | string | Date | boolean | null;
      }[]>
    • Parameters

      • query: string
      • connection: AsyncDuckDBConnection | Connection
      • returnDataFromQuery: boolean
      • options: {
            bigIntToInt?: boolean;
            debug: boolean;
            method: null | string;
            parameters: null | {
                [key: string]: unknown;
            };
            spatial?: boolean;
        }
        • Optional bigIntToInt?: boolean
        • debug: boolean
        • method: null | string
        • parameters: null | {
              [key: string]: unknown;
          }
        • Optional spatial?: boolean

      Returns Promise<null | {
          [key: string]: number | string | Date | boolean | null;
      }[]>

spatial: undefined | boolean

A flag to install the the spatial extension. Default is false. If true, extension will be loaded, which allows geospatial analysis.

worker: null | Worker

A worker to make DuckDB work.

Analyzing data

  • Assigns bins for specified column values within a table, based on an interval size.

    // Assigns a bin for each row in new column bins based on column1 values, with an interval of 10.
    await sdb.bins("tableA", "column1", 10, "bins")
    // If the minimum value in column1 is 5, the bins will follow this pattern: "[5-14]", "[15-24]", "[25-34]", etc.

    // Same thing, but with the bins starting at a specific value.
    await sdb.bins("tableA", "column1", 10, "bins", {startValue: 0})
    // The bins will follow this pattern: "[0-9]", "[10-19]", "[20-29]", etc.

    Parameters

    • table: string

      The name of the table for which bins will be computed.

    • values: string

      The column containing values from which bins will be computed.

    • interval: number

      The interval size for binning the values.

    • newColumn: string

      The name of the new column where the bins will be stored.

    • options: {
          startValue?: number;
      } = {}

      An optional object with configuration options:

      • Optional startValue?: number

        The starting value for binning. Defaults to the minimum value in the specified column.

    Returns Promise<void>

  • Calculates correlations between columns in a table.

    If no x and y columns are specified, the method computes the correlations of all numeric columns combinations. It's important to note that correlation is symmetrical: the correlation of x over y is the same as y over x.

    // Computes all correlations between all numeric columns in tableA and overwrite tableA with the results.
    await sdb.correlations("tableA")

    // Same but results are stored in tableB.
    await sdb.correlations("tableA", {outputTable: "tableB"})

    // Computes all correlations between a specific x column and all other numeric columns.
    await sdb.correlations("tableA", {x: "column1"})

    // Computes the correlations between a specific x and y columns.
    await sdb.correlations("tableA", {x: "column1", y: "column2"})

    Parameters

    • table: string

      The name of the table.

    • options: {
          categories?: string | string[];
          decimals?: number;
          outputTable?: string;
          x?: string;
          y?: string;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories. Correlation calculations will be run for each category.

      • Optional decimals?: number

        The number of decimal places to round the correlation values.

      • Optional outputTable?: string

        An option to store the results in a new table.

      • Optional x?: string

        The column name for the x values. Default is all numeric columns.

      • Optional y?: string

        The column name for the y values. Default is all numeric columns.

    Returns Promise<void>

  • Performs linear regression analysis and creates a table with regression results. The results include the slope, the y-intercept the R-squared.

    If no x and y columns are specified, the method computes the linear regression analysis of all numeric columns permutations. It's important to note that linear regression analysis is asymmetrical: the linear regression of x over y is not the same as y over x.

    // Computes all linear regressions between all numeric columns in tableA and overwrites tableA.
    await sdb.linearRegressions("tableA")

    // Same but stores the results in tableB.
    await sdb.linearRegressions("tableA", {outputTable: "tableB"})

    // Computes all linear regressions between a specific x column and all other numeric columns.
    await sdb.linearRegressions("tableA", {x: "column1"})

    // Computes the linear regression between a specific x and y columns.
    await sdb.linearRegressions("tableA", {x: "column1", y: "column2"})

    Parameters

    • table: string

      The name of the table.

    • options: {
          categories?: string | string[];
          decimals?: number;
          outputTable?: string;
          x?: string;
          y?: string;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories. Correlation calculations will be run for each category.

      • Optional decimals?: number

        The number of decimal places to round the regression coefficients.

      • Optional outputTable?: string
      • Optional x?: string

        The column name for the independent variable (x values) in the linear regression analysis.

      • Optional y?: string

        The column name for the dependent variable (y values) in the linear regression analysis.

    Returns Promise<void>

  • Normalizes the values in a column using min-max normalization.

    // Normalizes the values in the column1 from tableA.
    await sdb.normalize("tableA", "column1")

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column in which values will be normalized.

    • newColumn: string

      The name of the new column where normalized values will be stored.

    • options: {
          categories?: string | string[];
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for the normalization.

      • Optional decimals?: number

        The number of decimal places to round the normalized values.

    Returns Promise<void>

  • Identifies outliers using the Interquartile Range (IQR) method.

    // Looks for outliers in column age from table1. Creates a new column outliers with TRUE or FALSE values.
    await sdb.outliersIQR("table1", "age", "outliers")

    Parameters

    • table: string

      The name of the table containing the column for outlier detection.

    • column: string

      The name of the column in which outliers will be identified.

    • newColumn: string

      The name of the new column where the bins will be stored.

    • options: {
          categories?: string | string[];
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for outliers.

    Returns Promise<void>

  • Computes proportions within a row for specified columns in a given table.

    For example, let's say this is tableA.

    Year Men Women NonBinary
    2021 564 685 145
    2022 354 278 56
    2023 856 321 221

    We compute the proportions of men, women, and non-binary on each row.

    await sdb.proportionsHorizontal("tableA", ["Men", "Women", "NonBinary"])
    

    The table now looks like this.

    Year Men Women NonBinary MenPerc WomenPerc NonBinaryPerc
    2021 564 685 145 0.4 0.49 0.10
    2022 354 278 56 0.51 0.4 0.08
    2023 856 321 221 0.61 0.23 0.16

    By default, the new columns have the suffix "Perc", but you use something else if you want.

    await sdb.proportionsHorizontal("tableA", ["Men", "Women", "NonBinary"], {suffix: "Prop"})
    

    Here's the result with a different suffix.

    Year Men Women NonBinary MenProp WomenProp NonBinaryProp
    2021 564 685 145 0.4 0.49 0.10
    2022 354 278 56 0.51 0.4 0.08
    2023 856 321 221 0.61 0.23 0.16

    Parameters

    • table: string

      The name of the table.

    • columns: string[]

      The columns for which proportions will be computed on each row.

    • options: {
          decimals?: number;
          suffix?: string;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the computed proportions.

      • Optional suffix?: string

        A string suffix to append to the names of the new columns storing the computed proportions. Defaults to "Perc".

    Returns Promise<void>

  • Computes proportions over a column's values within a table.

    // This will add a column perc with the result of each column1 value divided by the sum of all column1 values.
    await sdb.proportionsVertical("tableA", "column1", "perc")

    Parameters

    • table: string

      The name of the table.

    • column: string

      The column containing values for which proportions will be computed. The proportions are calculated based on the sum of values in the specified column.

    • newColumn: string

      The name of the new column where the proportions will be stored.

    • options: {
          categories?: string | string[];
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for computing proportions. This can be a single column name or an array of column names.

      • Optional decimals?: number

        The number of decimal places to round the computed proportions.

    Returns Promise<void>

  • Assigns quantiles for specified column values within a table.

    // Assigns a quantile from 1 to 10 for each row in new column quantiles, based on values from column1.
    await sdb.quantiles("tableA", "column1", 10, "quantiles")

    // Same thing, except the values in column2 are used as categories.
    await sdb.quantiles("tableA", "column1", 10, "quantiles", {categories: "column2"})

    Parameters

    • table: string

      The name of the table.

    • values: string

      The column containing values from which quantiles will be assigned.

    • nbQuantiles: number

      The number of quantiles.

    • newColumn: string

      The name of the new column where the assigned quantiles will be stored.

    • options: {
          categories?: string | string[];
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for computing quantiles. This can be a single column name or an array of column names.

    Returns Promise<void>

  • Assigns ranks in a new column based on specified column values within a table.

    // Computes ranks in the new column rank from the column1 values.
    await sdb.ranks("tableA", "column1", "rank")

    // Computing ranks in the new column rank from the column1 values. Using the values from column2 as categories.
    await sdb.ranks("tableA", "column1", "rank", {categories: "column2"})

    Parameters

    • table: string

      The name of the table.

    • values: string

      The column containing values to be used for ranking.

    • newColumn: string

      The name of the new column where the ranks will be stored.

    • options: {
          categories?: string | string[];
          noGaps?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for ranking.

      • Optional noGaps?: boolean

        A boolean indicating whether to assign ranks without gaps. Defaults to false.

    Returns Promise<void>

  • Computes rolling aggregations, like a rolling average. For rows without enough preceding or following rows, returns NULL. For this method to work properly, don't forget to sort your data first.

    // 7-day rolling average of values in column1 with the 3 preceding and 3 following rows.
    await sdb.rolling("tableA", "column1", "rollingAvg", "mean", 3, 3)

    Parameters

    • table: string

      The name of the table

    • column: string

      The name of the column storing the values to be aggregated

    • newColumn: string

      The name of the new column in which the computed values will be stored

    • summary: "min" | "max" | "mean" | "median" | "sum"

      How to aggregate the values

    • preceding: number

      How many preceding rows to include

    • following: number

      How many following rows to include

    • options: {
          categories?: string | string[];
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The category or categories to be used for the aggragation.

      • Optional decimals?: number

        The number of decimal places to round the summarized values.

    Returns Promise<void>

  • Creates a summary table based on specified values, categories, and summary operations.

    // Summarizes all numeric columns with all available summary operations. Table tableA will be overwritten with the results.
    await sdb.summarize("tableA")

    // Same, but the results will be stored in tableB.
    await sdb.summarize("tableA", {outputTable: "tableB"})

    // Summarizes a specific column with all available summary operations. Values can be an array of column names, too.
    await sdb.summarize("tableA", {values: "column1"})

    // Summarizes a specific column with all available summary operations and use the values in another column as categories. Categories can be an array of column names, too.
    await sdb.summarize("tableA", {values: "column1", categories: "column2"})

    // Summarizes a specific column with a specific summary operation and use the values in another column as categories. Summaries can be an array of summary operations, too.
    await sdb.summarize("tableA", {values: "column1", categories: "column2", summaries: "mean"})

    // Summarizes and round values with a specific number of decimal places.
    await sdb.summarize("tableA", {values: "column1", categories: "column2", summaries: "mean", decimals: 4})

    Parameters

    • table: string

      The name of the table to be summarized.

    • options: {
          categories?: string | string[];
          decimals?: number;
          outputTable?: string;
          summaries?: "min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var" | ("min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var")[];
          values?: string | string[];
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for the summarization. This can be a single column name or an array of column names.

      • Optional decimals?: number

        The number of decimal places to round the summarized values.

      • Optional outputTable?: string

        An option to store the results in a new table.

      • Optional summaries?: "min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var" | ("min" | "max" | "mean" | "median" | "sum" | "count" | "countUnique" | "skew" | "stdDev" | "var")[]

        The summary operations to be performed. This can be a single summary operation or an array of summary operations.

      • Optional values?: string | string[]

        The column or columns whose values will be summarized. This can be a single column name or an array of column names.

    Returns Promise<void>

  • Computes the Z-score.

    // Calculates the Z-score for the values in column age and puts the results in column sigma.
    await sdb.zScore("table1", "age", "sigma")

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column for which Z-Score will be calculated.

    • newColumn: string

      The name of the new column where the bins will be stored.

    • options: {
          categories?: string | string[];
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional categories?: string | string[]

        The column or columns that define categories for zScores.

      • Optional decimals?: number

        The number of decimal places to round the Z-score values.

    Returns Promise<void>

Exporting data

  • Writes data from a table to a file.

    await sdb.writeData("tableA", "output/data.csv");
    

    Parameters

    • table: string

      The name of the table from which data will be written.

    • file: string

      The path to the file to which data will be written.

    • options: {
          compression?: boolean;
      } = {}

      An optional object with configuration options:

      • compression: A boolean indicating whether to compress the output file. Defaults to false. If true, CSV and JSON files will be compressed with GZIP while PARQUET files will use ZSTD.
      • Optional compression?: boolean

    Returns Promise<void>

  • Writes geospatial data from a table to a file.

    await sdb.writeGeoata("tableA", "output/data.geojson");
    

    Parameters

    • table: string

      The name of the table from which data will be written.

    • file: string

      The path to the file to which data will be written.

    • options: {
          precision?: number;
      } = {}

      An optional object with configuration options:

      • Optional precision?: number

        Maximum number of figures after decimal separator to write in coordinates.

    Returns Promise<void>

Geospatial

  • Computes the area of geometries. The values are returned in the SRS unit.

    // Computes the area of the geometries in the column geom from the table tableGeo, and returns the results in the column area.
    await sdb.area("tableGeo", "geom", "area")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    • newColumn: string

      The name of the new column storing the computed areas.

    Returns Promise<void>

  • Computes the centroid of geometries. The values are returned in the SRS unit.

    // Computes the centroid of the geometries in the column geom from the table tableGeo, and returns the results in the column centroid.
    await sdb.centroid("tableGeo", "geom", "centroid")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    • newColumn: string

      The name of the new column storing the centroids.

    Returns Promise<void>

  • Flips the coordinates of a geometry. Useful for some geojson files which have lat and lon inverted.

    await sdb.flipCoordinates("tableGeo", "geom")
    

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    Returns Promise<void>

  • Returns true if all points of a geometry lies inside another geometry.

    // Checks if geometries in column geomA are inside geometries in column geomB and return true or false in new column isInside.
    await sdb.inside("tableGeo", ["geomA", "geomB"], "isInside")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • columns: [string, string]

      The names of the two columns storing the geometries. The first column holds the geometries that will be tested for containment. The second column stores the geometries to be tested as containers.

    • newColumn: string

      The name of the new column with true or false values.

    Returns Promise<void>

  • Returns true if two geometries intersect.

    // Checks if geometries in geomA and in geomB intersect and return true or false in new column inter.
    await sdb.intersect("tableGeo", ["geomA", "geomB"], "inter")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • columns: [string, string]

      The names of the two columns storing the geometries.

    • newColumn: string

      The name of the new column with true or false values.

    Returns Promise<void>

  • Computes the intersection of geometries.

    // Computes the intersection of geometries in geomA and geomB columns from table tableGeo and puts the new geometries in column inter.
    await sdb.intersection("tableGeo", ["geomA", "geomB"], "inter")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • columns: [string, string]

      The names of the two columns storing the geometries.

    • newColumn: string

      The name of the new column storing the computed intersections.

    Returns Promise<void>

  • Checks if a geometry is valid.

    // Checks if the geometries in column geom from table tableGeo are valid and returns a boolean in column isValid.
    await sdb.isValidGeo("tableGeo", "geom", "isValid")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    • newColumn: string

      The name of the new column storing the results.

    Returns Promise<void>

  • Merges the data of two tables based on a spatial join. With SimpleNodeDB, it might create a .tmp folder, so make sure to add .tmp to your gitignore.

    // Merges data of tableA and tableB based on geometries that intersect in tableA and tableB. By default, the method looks for columns named 'geom' storing the geometries in the tables, does a left join and overwrites leftTable (tableA) with the results. The method also appends the name of the table to the 'geom' columns in the returned data.
    await sdb.joinGeo("tableA", "intersect", "tableB",)

    // Same thing but with specific column names storing geometries, a specific join type, and returning the results in a new table.
    await sdb.joinGeo("tableA", "intersect", "tableB", {geoColumnLeft: "geometriesA", geoColumnRight: "geometriesB", type: "inner", outputTable: "tableC"})

    // Merges data based on geometries in table A that are inside geometries in table B. The table order is important.
    await sdb.joinGeo("tableA", "inside", "tableB")

    Parameters

    • leftTable: string

      The name of the left table to be joined.

    • method: "intersect" | "inside"

      The method for the spatial join.

    • rightTable: string

      The name of the right table to be joined.

    • options: {
          columnLeftTable?: string;
          columnRightTable?: string;
          outputTable?: string;
          type?: "inner" | "left" | "right" | "full";
      } = {}

      An optional object with configuration options:

      • Optional columnLeftTable?: string

        The column storing the geometries in leftTable. It's 'geom' by default.

      • Optional columnRightTable?: string

        The column storing the geometries in rightTable. It's 'geom' by default.

      • Optional outputTable?: string

        The name of the new table that will store the result of the join operation. Default is the leftTable.

      • Optional type?: "inner" | "left" | "right" | "full"

        The type of join operation to perform. For some methods (like 'inside'), the table order is important.

    Returns Promise<void>

  • Extracts the latitude and longitude of points.

    // Extracts the latitude and longitude of points from the points in the "geom" column from "tableGeo" and put them in the columns "lat" and "lon"
    await sdb.latLon("tableGeo", "geom", ["lat", "lon"])

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the table storing the points.

    • newColumns: [string, string]

      The names the columns storing the latitude and the longitude, in this order.

    Returns Promise<void>

  • Creates or replaces a table and loads geospatial data from an external file into it.

    // With a URL
    await sdb.loadGeoData("tableGeo", "https://some-website.com/some-data.geojson")

    // With a local file
    await sdb.loadGeoData("tableGeo", "./some-data.geojson")

    Parameters

    • table: string

      The name of the new table.

    • file: string

      The URL or path to the external file containing the geospatial data.

    Returns Promise<void>

  • Removes the intersection of two geometries.

    // Removes the intersection of geomA and geomB from geomA and returns the results in the new column noIntersection.
    await sdb.removeIntersection("tableGeo", ["geomA", "geomB"], "noIntersection")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • columns: [string, string]

      The names of the two columns storing the geometries. The order is important.

    • newColumn: string

      The name of the new column storing the new geometries.

    Returns Promise<void>

  • Reprojects the data from one Spatial Reference System (SRS) to another.

    // From EPSG:3347 (also called NAD83/Statistics Canada Lambert with coordinates in meters) to EPSG:4326 (also called WGS84, with lat and lon in degrees)
    await sdb.reproject("tableGeo", "geom", "EPSG:3347", "EPSG:4326")

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    • from: string

      The original SRS.

    • to: string

      The target SRS.

    Returns Promise<void>

  • Simplifies the geometries while preserving their topology. The simplification occurs on an object-by-object basis.

    // Simplifies with a tolerance of 0.1.
    await sdb.simplify("tableGeo", "geomA", 0.1)

    Parameters

    • table: string

      The name of the table storing the geospatial data.

    • column: string

      The name of the column storing the geometries.

    • tolerance: number

      A number used for the simplification. A higher tolerance results in a more significant simplification.

    Returns Promise<void>

Getting data

  • Returns the bottom n rows from a table. The last row will be returned first. To keep the original order of the data, use the originalOrder option.

    // Last row will be returned first.
    const bottom10 = await sdb.getBottom("inventory", 10)

    // Last row will be returned last.
    const bottom10 = await sdb.getBottom("inventory", 10, {originalOrder: true})

    // With a condition
    const bottom10Books = await sdb.getBottom("inventory", 10, {condition: `category = 'Books'` })

    Parameters

    • table: string

      The name of the table.

    • count: number

      The number of rows to return.

    • options: {
          condition?: string;
          originalOrder?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        The filtering conditions specified as a SQL WHERE clause. Defaults to no condition.

      • Optional originalOrder?: boolean

        A boolean indicating whether the rows should be returned in their original order. Default is false, meaning the last row will be returned first.

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }[]>

  • Returns the data from a specified table with an optional condition.

    // No condition. Returns all data.
    const data = await sdb.getData("inventory")

    // With condition
    const books = await sdb.getData("inventory", {condition: "category = 'Book'"})

    Parameters

    • table: string

      The name of the table from which to retrieve the data.

    • options: {
          condition?: string;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        A SQL WHERE clause condition to filter the data. Defaults to no condition.

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }[]>

  • Returns the first row from a table based on optional filtering conditions.

    // No condition.
    const firstRow = await sdb.getFirstRow("inventory")

    // With condition
    const firstRowBooks = await sdb.getFirstRow("inventory", {condition: "category = 'Book'"})

    Parameters

    • table: string

      The name of the table.

    • options: {
          condition?: string;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        The filtering conditions specified as a SQL WHERE clause. Defaults to no condition.

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }>

  • Returns the last row from a table based on optional filtering conditions.

    // No condition.
    const lastRow = await sdb.getLastRow("inventory")

    // With condition
    const lastRowBooks = await sdb.getLastRow("inventory", {condition: "category = 'Book'"})

    Parameters

    • table: string

      The name of the table.

    • options: {
          condition?: string;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        The filtering conditions specified as a SQL WHERE clause. Defaults to no condition.

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }>

  • Returns the maximum value from a specific column in a table.

    const maximum = sdb.getMax("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    Returns Promise<null | string | number | boolean | Date>

  • Returns the mean value from a specific column in a table.

    const mean = sdb.getMean("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

  • Returns the median value from a specific column in a table.

    const median = sdb.getMedian("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

  • Returns the minimum value from a specific column in a table.

    const minimum = sdb.getMin("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    Returns Promise<null | string | number | boolean | Date>

  • Returns the value of a specific quantile from the values in a given column of a table.

    const firstQuartile = sdb.getQuantile("tableA", "column1", 0.25)
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column from which to calculate the quantile.

    • quantile: number

      The quantile (between 0 and 1) to calculate. For example, 0.25 for the first quartile.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

  • Returns the skewness of values from a specific column in a table.

    const skew = sdb.getSkew("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

  • Returns the standard deviation of values from a specific column in a table.

    const standardDeviation = sdb.getStdDev("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

  • Returns the sum of values from a specific column in a table.

    const sum = sdb.getSum("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    Returns Promise<number>

  • Returns the top n rows from a table.

    const top10 = await sdb.getTop("inventory", 10)

    // With a condition
    const top10Books = await sdb.getTop("inventory", 10, {condition: `category = 'Books'` })

    Parameters

    • table: string

      The name of the table.

    • count: number

      The number of rows to return.

    • options: {
          condition?: string;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        The filtering conditions specified as a SQL WHERE clause. Defaults to no condition.

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }[]>

  • Returns unique values from a specific column in a table. For convenience, it returns the value ascendingly.

    const uniques = await sdb.getUniques("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column from which to retrieve unique values.

    Returns Promise<(null | string | number | boolean | Date)[]>

  • Returns the values of a specific column in a table.

    const values = await sdb.getValues("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    Returns Promise<(null | string | number | boolean | Date)[]>

  • Returns the variance of values from a specific column in a table.

    const variance = sdb.getVar("tableA", "column1")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column.

    • options: {
          decimals?: number;
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round the result to.

    Returns Promise<number>

Importing data

  • Inserts rows formatted as an array of objects into an existing table.

    const rows = [ { letter: "a", number: 1 }, { letter: "b", number: 2 }]
    await sdb.insertRows("tableA", rows)

    Parameters

    • table: string

      The name of the table to insert rows into.

    • rows: {
          [key: string]: unknown;
      }[]

      An array of objects representing the rows to be inserted into the table.

    Returns Promise<void>

  • Inserts all rows from one table (or multiple tables) into another existing table.

    // Insert all rows from tableB into tableA.
    await sdb.insertTable("tableA", "tableB")
    // Insert all rows from tableB and tableC into tableA.
    await sdb.insertTable("tableA", ["tableB", "tableC"])

    Parameters

    • table: string

      The name of the table to insert rows into.

    • tablesToInsert: string | string[]

      The name of the table(s) from which rows will be inserted.

    Returns Promise<void>

  • Creates or replaces a table and loads an array of objects into it.

    const data = [{letter: "a", number: 1}, {letter: "b", number: 2}]
    await simpleDB.loadArray("tableA", data)

    Parameters

    • table: string

      The name of the table to be created.

    • arrayOfObjects: {
          [key: string]: unknown;
      }[]

      An array of objects representing the data.

    Returns Promise<void>

  • Creates a table and loads data from local or remote file(s) into it. CSV, JSON, and PARQUET files are accepted.

    // Load data from a local file into tableA
    await sdb.loadData("tableA", "./some-data.csv")

    // Load data from multiple local files into tableB
    await sdb.loadData("tableB", ["./some-data1.json", "./some-data2.json", "./some-data3.json"])

    // Load data from a remote file into tableC
    await sdb.loadData("tableC", "https://some-website.com/some-data.parquet")

    // Load data from multiple remote files into tableD
    await sdb.loadData("tableD", ["https://some-website.com/some-data1.parquet", "https://some-website.com/some-data2.parquet", "https://some-website.com/some-data3.parquet"])

    Parameters

    • table: string

      The name of the table into which data will be loaded.

    • files: string | string[]

      The path(s) or url(s) of file(s) containing the data to be loaded. CSV, JSON, and PARQUET files are accepted.

    • options: {
          allText?: boolean;
          autoDetect?: boolean;
          columnTypes?: {
              [key: string]: string;
          };
          compression?: "none" | "gzip" | "zstd";
          delim?: string;
          fileName?: boolean;
          fileType?: "csv" | "dsv" | "json" | "parquet" | "excel";
          header?: boolean;
          jsonFormat?: "unstructured" | "newlineDelimited" | "array";
          limit?: number;
          records?: boolean;
          sheet?: string;
          skip?: number;
          unifyColumns?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional allText?: boolean

        A boolean indicating whether all columns should be treated as text. Applicable to CSV files. Defaults to false.

      • Optional autoDetect?: boolean

        A boolean indicating whether to automatically detect the data format. Defaults to true.

      • Optional columnTypes?: {
            [key: string]: string;
        }

        An object mapping the column names with their expected types. By default, the types are inferred.

        • [key: string]: string
      • Optional compression?: "none" | "gzip" | "zstd"

        The compression type. Applicable to CSV files. Defaults to none.

      • Optional delim?: string

        The delimiter used in the file. Applicable to CSV and DSV files. By default, the delimiter is inferred.

      • Optional fileName?: boolean

        A boolean indicating whether to include the file name as a column in the loaded data. Defaults to false.

      • Optional fileType?: "csv" | "dsv" | "json" | "parquet" | "excel"

        The type of file to load ("csv", "dsv", "json", "parquet"). Defaults to the first file extension.

      • Optional header?: boolean

        A boolean indicating whether the file has a header. Applicable to CSV files. Defaults to true.

      • Optional jsonFormat?: "unstructured" | "newlineDelimited" | "array"

        The format of JSON files ("unstructured", "newlineDelimited", "array"). By default, the format is inferred.

      • Optional limit?: number

        A number indicating the number of rows to load. Defaults to all rows.

      • Optional records?: boolean

        A boolean indicating whether each line in a newline-delimited JSON file represents a record. Applicable to JSON files. By default, it's inferred.

      • Optional sheet?: string

        A string indicating a specific sheet to import. Applicable to Excel files. By default, the first sheet is imported.

      • Optional skip?: number

        The number of lines to skip at the beginning of the file. Applicable to CSV files. Defaults to 0.

      • Optional unifyColumns?: boolean

        A boolean indicating whether to unify columns across multiple files, when the files structure is not the same. Defaults to false.

    Returns Promise<void>

  • Creates a table and loads data from all files in a local directory. CSV, JSON, and PARQUET files are accepted.

    await sdb.loadDataFromDirectory("tableA", "./data/")
    

    Parameters

    • table: string

      The name of the table into which data will be loaded.

    • directory: string

      The path of the directory containing the data files to be loaded. CSV, JSON, and PARQUET files are accepted.

    • options: {
          allText?: boolean;
          autoDetect?: boolean;
          columnTypes?: {
              [key: string]: string;
          };
          delim?: string;
          fileName?: boolean;
          fileType?: "csv" | "dsv" | "json" | "parquet";
          header?: boolean;
          jsonFormat?: "unstructured" | "newlineDelimited" | "array";
          records?: boolean;
          skip?: number;
          unifyColumns?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional allText?: boolean

        A boolean indicating whether all columns should be treated as text. Applicable to CSV files. Defaults to false.

      • Optional autoDetect?: boolean

        A boolean indicating whether to automatically detect the data format. Defaults to true.

      • Optional columnTypes?: {
            [key: string]: string;
        }

        An object mapping the column names with their expected types. By default, the types are inferred.

        • [key: string]: string
      • Optional delim?: string

        The delimiter used in the file. Applicable to CSV and DSV files. By default, the delimiter is inferred.

      • Optional fileName?: boolean

        A boolean indicating whether to include the file name as a column in the loaded data. Defaults to false.

      • Optional fileType?: "csv" | "dsv" | "json" | "parquet"

        The type of file to load ("csv", "dsv", "json", "parquet"). Defaults to the first file extension.

      • Optional header?: boolean

        A boolean indicating whether the file has a header. Applicable to CSV files. Defaults to true.

      • Optional jsonFormat?: "unstructured" | "newlineDelimited" | "array"

        The format of JSON files ("unstructured", "newlineDelimited", "array"). By default, the format is inferred.

      • Optional records?: boolean

        A boolean indicating whether each line in a newline-delimited JSON file represents a record. Applicable to JSON files. By default, it's inferred.

      • Optional skip?: number

        The number of lines to skip at the beginning of the file. Applicable to CSV files. Defaults to 0.

      • Optional unifyColumns?: boolean

        A boolean indicating whether to unify columns across multiple files, when the files structure is not the same. Defaults to false.

    Returns Promise<void>

Restructuring data

  • Adds a new column to a table based on a type (JavaScript or SQL types) and a SQL definition.

    // Adds column3 to tableA. The column's values are floats (equivalent to DOUBLE in SQL) and are the results of the sum of values from column1 and column2.
    await sdb.addColumn("tableA", "column3", "float", "column1 + column2")

    Parameters

    • table: string

      The name of the table to which the new column will be added.

    • newColumn: string

      The name of the new column to be added.

    • type: "string" | "number" | "bigint" | "boolean" | "integer" | "float" | "date" | "time" | "datetime" | "datetimeTz" | "double" | "varchar" | "timestamp" | "timestamp with time zone" | "geometry"

      The data type for the new column. JavaScript or SQL types.

    • definition: string

      SQL expression defining how the values should be computed for the new column.

    Returns Promise<void>

  • Adds a new column to a table with the row number.

    // Adds the row number in new column rowNumber.
    await sdb.addRowNumber("tableA", "rowNumber")

    Parameters

    • table: string

      The name of the table

    • newColumn: string

      The name of the new column storing the row number

    Returns Promise<void>

  • Clones a column in a given table.

    // Clones column1 as column2 from tableA
    await sdb.cloneColumn("tableA", "column1", "column2")

    Parameters

    • table: string

      The table in which the cloning should happen.

    • originalColumn: string

      The original column.

    • newColumn: string

      The name of the cloned column.

    Returns Promise<void>

  • Converts data types (JavaScript or SQL types) of specified columns in a table.

    // Converts column1 to string and column2 to integer
    await sdb.convert("tableA", {column1: "string", column2: "integer"})

    // Same thing but with SQL types
    await sdb.convert("tableA", {column1: "varchar", column2: "bigint"})

    // Converts a string to a date
    await sdb.convert("tableA", {column3: "datetime"}, {datetimeFormat: "%Y-%m-%d" })

    Parameters

    • table: string

      The name of the table where data types will be converted.

    • types: {
          [key: string]: "integer" | "float" | "number" | "string" | "date" | "time" | "datetime" | "datetimeTz" | "bigint" | "double" | "varchar" | "timestamp" | "timestamp with time zone" | "boolean";
      }

      An object mapping column names to the target data types for conversion.

      • [key: string]: "integer" | "float" | "number" | "string" | "date" | "time" | "datetime" | "datetimeTz" | "bigint" | "double" | "varchar" | "timestamp" | "timestamp with time zone" | "boolean"
    • options: {
          datetimeFormat?: string;
          try?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional datetimeFormat?: string

        A string specifying the format for date and time conversions. The method uses strftime and strptime functions from DuckDB. For the format specifiers, see https://duckdb.org/docs/sql/functions/dateformat.

      • Optional try?: boolean

        When true, the values that can't be converted will be replaced by NULL instead of throwing an error. Defaults to false.

    Returns Promise<void>

  • Creates a new empty table with specified columns and data types (JavaScript or SQL). If the table already exists, it will be overwritten.

     await sdb.createTable("employees", {
    name: "string",
    salary: "integer",
    raise: "float",
    })

    Parameters

    • table: string

      The name of the table.

    • types: {
          [key: string]: "integer" | "float" | "number" | "string" | "date" | "time" | "datetime" | "datetimeTz" | "bigint" | "double" | "varchar" | "timestamp" | "timestamp with time zone" | "boolean" | "geometry";
      }

      An object specifying the columns and their data types (JavaScript or SQL).

      • [key: string]: "integer" | "float" | "number" | "string" | "date" | "time" | "datetime" | "datetimeTz" | "bigint" | "double" | "varchar" | "timestamp" | "timestamp with time zone" | "boolean" | "geometry"

    Returns Promise<void>

  • Performs a cross join operation between two tables returning all pairs of rows. With SimpleNodeDB, it might create a .tmp folder, so make sure to add .tmp to your gitignore.

    // By default, the leftTable (tableA here) will be overwritten with the result.
    await crossJoin("tableA", "tableB");

    // But you can put the result into another table if needed.
    await crossJoin("tableA", "tableB", { outputTable: "tableC" });

    Parameters

    • leftTable: string

      The name of the left table.

    • rightTable: string

      The name of the right table.

    • options: {
          outputTable?: string;
      } = {}

      An optional object with configuration options:

      • Optional outputTable?: string

        The name of the table that will be created or replaced with the result of the cross join.

    Returns Promise<void>

  • Merges the data of two tables based on a common column. With SimpleNodeDB, it might create a .tmp folder, so make sure to add .tmp to your gitignore.

    // By default, the method automatically looks for a common column in the two tables and does a left join of tableA (left) and tableB (right). The leftTable (tableA here) will be overwritten with the result.
    await sdb.join("tableA", "tableB")

    // You can change the common column, the join type, and the output table in options.
    await sdb.join("tableA", "tableB", { commonColumn: 'id', type: 'inner', outputTable: 'tableC' })

    Parameters

    • leftTable: string

      The name of the left table to be joined.

    • rightTable: string

      The name of the right table to be joined.

    • options: {
          commonColumn?: string;
          outputTable?: string;
          type?: "inner" | "left" | "right" | "full";
      } = {}

      An optional object with configuration options:

      • Optional commonColumn?: string

        The common column used for the join operation. By default, the method automatically searches for a column name that exists in both tables.

      • Optional outputTable?: string

        The name of the new table that will store the result of the join operation. Default is the leftTable.

      • Optional type?: "inner" | "left" | "right" | "full"

        The type of join operation to perform. Possible values are "inner", "left", "right", or "full". Default is "left".

    Returns Promise<void>

  • Restructures a table by stacking values. Useful to tidy up data.

    As an example, let's use this as tableA. Let's say it shows the number of employees per year in different departments.

    Department 2021 2022 2023
    Accounting 10 9 15
    Sales 52 75 98

    We restructure it by putting all years into a column Year and the employees counts into a column Employees.

    await sdb.longer("tableA", ["2021", "2022", "2023"], "year", "employees")
    

    Now, the table looks like this and is longer.

    Department Year Employees
    Accounting 2021 10
    Accounting 2022 9
    Accounting 2023 15
    Sales 2021 52
    Sales 2022 75
    Sales 2023 98

    Parameters

    • table: string

      The name of the table to be restructured.

    • columns: string[]

      The column names (and associated values) that we want to stack.

    • columnsTo: string

      The new column in which the stacked columns' names will be put into.

    • valuesTo: string

      The new column in which the stacked columns' values will be put into.

    Returns Promise<void>

  • Removes one or more columns from a table.

    await sdb.removeColumns("tableA", ["column1", "column2"])
    

    Parameters

    • table: string

      The name of the table from which columns will be removed.

    • columns: string | string[]

      The name or an array of names of the columns to be removed.

    Returns Promise<void>

  • Removes one or more tables from the database.

    await sdb.removeTables(["table1", "table2"])
    

    Parameters

    • tables: string | string[]

      The name or an array of names of the tables to be removed.

    Returns Promise<void>

  • Renames columns in a specified table.

    // Renaming "How old?" to "age" and "Man or woman?" to "sex" in tableA.
    await sdb.renameColumns("tableA", {"How old?" : "age", "Man or woman?": "sex"})

    Parameters

    • table: string

      The table in which columns will be renamed.

    • names: {
          [key: string]: string;
      }

      An object mapping old column names to new column names.

      • [key: string]: string

    Returns Promise<void>

  • Restructures a table by unstacking values.

    As an example, let's use this as tableA. Let's say it shows the number of employees per year in different departments.

    Department Year Employees
    Accounting 2021 10
    Accounting 2022 9
    Accounting 2023 15
    Sales 2021 52
    Sales 2022 75
    Sales 2023 98

    We restructure it by making a new column for each year and with the associated employees counts as values.

    await sdb.longer("tableA", "Year", "Employees")
    

    Now, the table looks like this and is wider.

    Department 2021 2022 2023
    Accounting 10 9 15
    Sales 52 75 98

    Parameters

    • table: string

      The name of the table to be restructured.

    • columnsFrom: string

      The column containing the values that will be transformed into columns.

    • valuesFrom: string

      The column containing values to be spread across the new columns.

    Returns Promise<void>

Selecting or filtering data

  • Filters rows from a table based on SQL conditions. Note that it's often faster to use the removeRows method.

    // In table store, keep only rows where the fruit is not an apple.
    await sdb.filter("store", "fruit != 'apple'")

    // More examples:
    await sdb.filter("store", "price > 100 AND quantity > 0")
    await sdb.filter("inventory", "category = 'Electronics' OR category = 'Appliances'")
    await sdb.filter("customers", "lastPurchaseDate >= '2023-01-01'")

    Parameters

    • table: string

      The name of the table from which rows will be filtered.

    • conditions: string

      The filtering conditions specified as a SQL WHERE clause.

    Returns Promise<void>

  • Keeps rows with specific values in specific columns.

    // In table employees, keep only rows where the job is accountant or developer and where the city is Montreal.
    await sdb.keep("employees", { job: [ "accountant", "developer" ], city: [ "Montreal" ] })

    Parameters

    • table: string

      The name of the table

    • columnsAndValues: {
          [key: string]: unknown[];
      }

      An object with the columns (keys) and the values to be kept (values as arrays)

      • [key: string]: unknown[]

    Returns Promise<void>

  • Remove rows with specific values in specific columns.

    // In table employees, remove rows where the job is accountant or developer and where the city is Montreal.
    await sdb.remove("employees", { job: [ "accountant", "developer" ], city: [ "Montreal" ] })

    Parameters

    • table: string

      The name of the table

    • columnsAndValues: {
          [key: string]: unknown[];
      }

      An object with the columns (keys) and the values to be removed (values as arrays)

      • [key: string]: unknown[]

    Returns Promise<void>

  • Removes duplicate rows from a table, keeping unique rows. Note that SQL does not guarantee any specific order when using DISTINCT. So the data might be returned in a different order than the original.

    await sdb.removeDuplicates("tableA")
    

    Parameters

    • table: string

      The name of the table from which duplicates will be removed.

    • options: {
          on?: string | string[];
      } = {}

      An optional object with configuration options:

      • Optional on?: string | string[]

        A column or multiple columns to consider to remove duplicates. The other columns in the table will not be considered to exclude duplicates.

    Returns Promise<void>

  • Removes rows with missing values from a table. By default, missing values are NULL (as an SQL value), but also "NULL", "null", "NaN" and "undefined" that might have been converted to strings before being loaded into the table. Empty strings "" are also considered missing values.

    // Removes rows with missing values in any columns.
    await sdb.removeMissing("tableA")

    // Removes rows with missing values in specific columns.
    await sdb.removeMissing("tableA", { columns: ["firstName", "lastName"]})

    Parameters

    • table: string

      The name of the table from which rows with missing values will be removed.

    • options: {
          columns?: string | string[];
          invert?: boolean;
          missingValues?: (string | number)[];
      } = {}

      An optional object with configuration options:

      • Optional columns?: string | string[]

        Either a string or an array of strings specifying the columns to consider for missing values. By default, all columns are considered.

      • Optional invert?: boolean

        A boolean indicating whether to invert the condition, keeping only rows with missing values. Defaults to false.

      • Optional missingValues?: (string | number)[]

        An array of values to be treated as missing values. Defaults to ["undefined", "NaN", "null", "NULL", ""].

    Returns Promise<void>

  • Removes rows from a table based on SQL conditions.

    // In table store, remove rows where the fruit is an apple.
    await sdb.removeRows("store", "fruit = 'apple'")

    Parameters

    • table: string

      The name of the table from which rows will be removed.

    • conditions: string

      The filtering conditions specified as a SQL WHERE clause.

    Returns Promise<void>

  • Selects random rows from a table and removes the others.

    // Selects 100 random rows in tableA
    await sdb.sample("tableA", 100)

    // Selects 10% of the rows randomly in tableB
    await sdb.sample("tableB", "10%")

    Parameters

    • table: string

      The name of the table from which rows will be sampled.

    • quantity: string | number

      The number of rows (1000 for example) or a string ("10%" for example) specifying the sampling size.

    • options: {
          seed?: number;
      } = {}

      An optional object with configuration options:

      • Optional seed?: number

        A number specifying the seed for repeatable sampling. For example, setting it to 1 will ensure random rows will be the same each time you run the method.

    Returns Promise<void>

  • Selects specific columns in a table and removes the others.

    // Selecting only the columns firstName and lastName from tableA. All other columns in the table will be removed.
    await sdb.selectColumns("tableA", ["firstName", "lastName"])

    Parameters

    • table: string

      The name of the table from which columns will be selected.

    • columns: string | string[]

      Either a string (one column) or an array of strings (multiple columns) representing the columns to be selected.

    Returns Promise<void>

  • Selects n rows from a table. An offset and outputTable options are available.

    // Selects the first 100 rows from tableA.
    await sdb.selectRows("tableA", 100)

    // Selects 100 rows from tableA, after skipping the first 100 rows.
    await sdb.selectRows("tableA", 100, {offset: 100})

    // Selects 100 rows from tableA and stores them in tableB.
    await sdb.selectRows("tableA", 100, {outputTable: "tableB"})

    Parameters

    • table: string

      The name of the table from which rows will be selected.

    • count: string | number

      The number of rows.

    • options: {
          offset?: number;
          outputTable?: string;
      } = {}

      An optional object with configuration options:

      • Optional offset?: number

        The number of rows to skip before selecting. Defaults to 0.

      • Optional outputTable?: string

        The name of the table that will be created or replaced and where the new rows will be stored. By default, the original table is overwritten.

    Returns Promise<void>

Updating data

  • A method to perform computations on small batches instead of all of the data at once. Useful with big join operations.

    // The computation we want to run. A third parameter outputTable is optional.
    const run = async (sdb: SimpleDB, originalTable: string) => {
    await sdb.convert(
    originalTable,
    { salary: "number" },
    )
    await sdb.addColumn(
    originalTable,
    "salaryMultipliedBy2",
    "number",
    `Salary * 2`
    )
    }
    // Running it batch after batch.
    await simpleNodeDB.batch(run, "tableA", {
    batchSize: 5,
    })

    Parameters

    • run: ((simpleDB, originalTable, outputTable?) => Promise<void>)

      The function to be executed in batches.

        • (simpleDB, originalTable, outputTable?): Promise<void>
        • Parameters

          • simpleDB: SimpleNodeDB
          • originalTable: string
          • Optional outputTable: string

          Returns Promise<void>

    • originalTable: string

      The name of the original table to be processed.

    • options: {
          batchSize?: number;
          logBatchNumber?: boolean;
          outputTable?: string;
      } = {}

      An optional object with configuration options:

      • Optional batchSize?: number

        The number of items to process in each batch. Default is 10.

      • Optional logBatchNumber?: boolean

        A boolean indicating whether to log the batch number. Default is false.

      • Optional outputTable?: string

        The name of the output table where results will be stored. By default, the original table will be overwritten.

    Returns Promise<void>

  • Concatenates values from specified columns into a new column in a table.

    // Concatenates values from column1 and column2 into column3
    await sdb.concatenate("tableA", ["column1", "column2"], "column3")

    // Same thing, but the values will be separated by a dash
    await sdb.concatenate("tableA", ["column1", "column2"], "column3", {separator: "-"})

    Parameters

    • table: string

      The name of the table where concatenation will occur.

    • columns: string[]

      An array of column names from which values will be concatenated.

    • newColumn: string

      The name of the new column to store the concatenated values.

    • options: {
          separator?: string;
      } = {}

      An optional object with configuration options:

      • Optional separator?: string

        The string used to separate concatenated values. Defaults to an empty string.

    Returns Promise<void>

  • Extracts a specific number of characters, starting from the left.

    // Strings in column1 of tableA will be replaced by the first two characters of each string.
    await sdb.left("tableA", "column1", 2)

    Parameters

    • table: string

      The name of the table

    • column: string

      The name of the column storing the strings

    • numberOfCharacters: number

      The number of characters, starting from the left

    Returns Promise<void>

  • Formats strings to lowercase.

    // Just in one column.
    await sdb.lower("tableA", "column1")

    // In multiple columns
    await sdb.lower("tableA", ["column1", "column2"])

    Parameters

    • table: string

      The name of the table in which strings will be formatted to lowercase.

    • columns: string | string[]

      Either a string or an array of strings specifying the columns to be updated.

    Returns Promise<void>

  • Replaces specified strings in the selected columns of a table.

    // Replaces entire strings and substrings too.
    await sdb.replace("tableA", "column1", {"kilograms": "kg", liters: "l" })

    // Replaces only if matching entire string.
    await sdb.replace("tableA", "column1", {"kilograms": "kg", liters: "l" }, {entireString: true})

    // Replaces using a regular expression. Any sequence of one or more digits would be replaced by a hyphen.
    await sdb.replace("tableA", "column1", {"\d+": "-" }, {regex: true})

    Parameters

    • table: string

      The name of the table in which strings will be replaced.

    • columns: string | string[]

      Either a string or an array of strings specifying the columns where string replacements will occur.

    • strings: {
          [key: string]: string;
      }

      An object mapping old strings to new strings.

      • [key: string]: string
    • options: {
          entireString?: boolean;
          regex?: boolean;
      } = {}

      An optional object with configuration options:

      • Optional entireString?: boolean

        A boolean indicating whether the entire string must match for replacement. Defaults to false.

      • Optional regex?: boolean

        A boolean indicating the use of regular expressions for a global replace. See the RE2 docs for the syntax. Defaults to false.

    Returns Promise<void>

  • Replaces null values in the selected columns of a table.

    // Replace null values by 0.
    await sdb.replaceNulls("tableA", "column1", 0)

    Parameters

    • table: string

      The name of the table in which strings will be replaced.

    • columns: string | string[]

      Either a string or an array of strings specifying the columns where string replacements will occur.

    • value: string | number | boolean | Date

      The value to replace the null values.

    Returns Promise<void>

  • Extracts a specific number of characters, starting from the right.

    // Strings in column1 of tableA will be replaced by the last two characters of each string.
    await sdb.right("tableA", "column1", 2)

    Parameters

    • table: string

      The name of the table

    • column: string

      The name of the column storing the strings

    • numberOfCharacters: number

      The number of characters, starting from the right

    Returns Promise<void>

  • Rounds numeric values in specified columns of a table.

    // Rounds column1's values to the nearest integer.
    await sdb.round("tableA", "column1")

    // Rounds column1's values with a specific number of decimal places.
    await sdb.round("tableA", "column1", {decimals: 2})

    // Rounds column1's values with a specific method. Available methods are "round", "floor" and "ceiling".
    await sdb.round("tableA", "column1", {method: "floor"})

    Parameters

    • table: string

      The name of the table where numeric values will be rounded.

    • columns: string | string[]

      Either a string or an array of strings specifying the columns containing numeric values to be rounded.

    • options: {
          decimals?: number;
          method?: "round" | "ceiling" | "floor";
      } = {}

      An optional object with configuration options:

      • Optional decimals?: number

        The number of decimal places to round to. Defaults to 0.

      • Optional method?: "round" | "ceiling" | "floor"

        The rounding method to use. Defaults to "round".

    Returns Promise<void>

  • Sorts the rows of a table based on specified column(s) and order(s).

    // Sorts column1 ascendingly then column2 descendingly.
    await sdb.sort("tableA", {column1: "asc", column2: "desc"})

    // Same thing but taking French accent into account.
    await sdb.sort("tableA", {column1: "asc", column2: "desc"}, {lang: {column1: "fr"}})

    Parameters

    • table: string

      The name of the table to sort.

    • order: {
          [key: string]: "asc" | "desc";
      }

      An object mapping column names to the sorting order: "asc" for ascending or "desc" for descending.

      • [key: string]: "asc" | "desc"
    • options: {
          lang?: {
              [key: string]: string;
          };
      } = {}

      An optional object with configuration options:

    Returns Promise<void>

  • Splits strings along a separator and replaces the values with a substring at a specified index (starting at 0). If the index is outside the bounds of the list, return an empty string.

    // Splits on commas and replaces values with the second substring.
    await sdb.splitExtract("tableA", "column1", ",", 1)

    Parameters

    • table: string

      The name of the table

    • column: string

      The name of the column storing the strings

    • separator: string

      The substring to use as a separator

    • index: number

      The index of the substring to replace values

    Returns Promise<void>

  • Trims specified characters from the beginning, end, or both sides of string values.

    // Trims values in column1
    await sdb.trim("tableA", "column1")

    // Trims values in column2, columns3, and column4
    await sdb.trim("tableA", ["column2", "column3", "column4"])

    Parameters

    • table: string

      The name of the table.

    • columns: string | string[]

      The column or columns to trim.

    • options: {
          character?: string;
          method?: "trim" | "leftTrim" | "rightTrim";
      } = {}

      An optional object with configuration options:

      • Optional character?: string

        The string to trim. Defaults to whitespace.

      • Optional method?: "trim" | "leftTrim" | "rightTrim"

        The trimming method.

    Returns Promise<void>

  • Updates values in a specified column in a given table.

    await sdb.updateColumn("tableA", "column1", `LEFT(column2)`)
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column to be updated.

    • definition: string

      The SQL expression to set the new values in the column.

    Returns Promise<void>

  • Updates data in a table using a JavaScript function. The function takes the existing rows as an array of objects and must return them modified as an array of objects. This method provides a flexible way to update data, but it's slow.

    // Adds one to the values from column1 in tableA. If the values are not numbers, they are replaced by null.
    await sdb.updateWithJS("tableA", (rows) => {
    const modifiedRows = rows.map(d => ({
    ...d,
    column1: typeof d.column1 === "number" ? d.column1 + 1 : null
    }))
    return modifiedRows
    })

    Parameters

    • table: string

      The name of the table to update.

    • dataModifier: ((rows) => {
          [key: string]: number | string | Date | boolean | null;
      }[])

      A function that takes the existing rows and returns modified rows using JavaScript logic. The original rows are objects in an array and the modified rows must be returned as an array of objects too.

        • (rows): {
              [key: string]: number | string | Date | boolean | null;
          }[]
        • Parameters

          • rows: {
                [key: string]: number | string | Date | boolean | null;
            }[]

          Returns {
              [key: string]: number | string | Date | boolean | null;
          }[]

    Returns Promise<void>

  • Formats strings to uppercase.

    // Just in one column.
    await sdb.upper("tableA", "column1")

    // In multiple columns
    await sdb.upper("tableA", ["column1", "column2"])

    Parameters

    • table: string

      The name of the table in which strings will be formatted to uppercase.

    • columns: string | string[]

      Either a string or an array of strings specifying the columns to be updated.

    Returns Promise<void>

Other

  • Clones an existing table by creating or replacing a table with the same structure and data. The data can be optionally filtered. This can be very slow with big tables.

    // tableA data is cloned into tableB.
    await sdb.cloneTable("tableA", "tableB")

    // tableA data is cloned into tableB. Only rows with values greater than 10 in column1 are kept.
    await sdb.cloneTable("tableA", "tableB", {condition: `column1 > 10`})

    Parameters

    • originalTable: string

      The name of the table to be cloned.

    • newTable: string

      The name of the new table that will be created as a clone.

    • options: {
          condition?: string;
      } = {}

      An optional object with configuration options:

      • Optional condition?: string

        A SQL WHERE clause condition to filter the data. Defaults to no condition.

    Returns Promise<void>

  • Executes a custom SQL query, providing flexibility for advanced users.

    // You can use the returnDataFrom option to retrieve the data from the query, if needed.
    await sdb.customQuery( "SELECT * FROM employees WHERE Job = 'Clerk'", {returnDataFrom: "query"})

    Parameters

    • query: string

      The custom SQL query to be executed.

    • options: {
          returnDataFrom?: "query" | "none";
          table?: string;
      } = {}

      An optional object with configuration options:

      • Optional returnDataFrom?: "query" | "none"

        Specifies whether to return data from the "query" or not. Defaults to "none".

      • Optional table?: string

        The name of the table associated with the query (if applicable). Needed when debug is true.

    Returns Promise<null | {
        [key: string]: string | number | boolean | Date | null;
    }[]>

  • Return the list of column names for a specified table.

    const columns = await sdb.getColumns("dataCsv")
    

    Parameters

    • table: string

      The name of the table for which to retrieve column names.

    Returns Promise<string[]>

  • Returns descriptive information about the columns of a specified table, including details like data types, number of null and distinct values. Best to look at with console.table.

    const description = await sdb.getDescription("tableA")
    

    Parameters

    • table: string

      The name of the table.

    Returns Promise<{
        [key: string]: unknown;
    }[]>

  • Returns the DuckDB extensions.

    const extensions = await sdb.getExtensions()
    

    Returns Promise<{
        [key: string]: string | number | boolean | Date | null;
    }[]>

  • Returns the number of rows (length) in a table.

    const nbRows = await sdb.getLength("tableA")
    

    Parameters

    • table: string

      The name of the table.

    Returns Promise<number>

  • Returns the schema (column names and their data types) of a specified table.

    const schema = await sdb.getSchema("tableA")
    

    Parameters

    • table: string

      The name of the table for which to retrieve the schema.

    Returns Promise<{
        [key: string]: string | null;
    }[]>

  • Returns the data types of columns in a table.

    const dataTypes = await sdb.getTypes("tableA")
    

    Parameters

    • table: string

      The name of the table.

    Returns Promise<{
        [key: string]: string;
    }>

  • Returns the number of data points (cells/values) in a table.

    const nbDataPoints = await sdb.getValuesCount("tableA")
    

    Parameters

    • table: string

      The name of the table .

    Returns Promise<number>

  • Returns the number of columns (width) in a table.

    const nbColumns = await sdb.getWidth("tableA")
    

    Parameters

    • table: string

      The name of the table.

    Returns Promise<number>

  • Returns true if a specified column exists in a given table and false if not.

    const hasColumnSalary = await sdb.hasColumn("employees", "salary")
    

    Parameters

    • table: string

      The name of the table.

    • column: string

      The name of the column to check for existence.

    Returns Promise<boolean>

  • Returns true if a specified table exists and false if not.

    const hasEmployees = await sdb.hasTable("employees")
    

    Parameters

    • table: string

      The name of the table to check for existence.

    Returns Promise<boolean>

  • Logs a specified number of rows from a table. Default is 10 rows.

    // Logs first 10 rows
    await sdb.logTable("tableA");

    // Logs first 100 rows
    await sdb.logTable("tableA", {nbRowsToLog: 100});

    Parameters

    • table: string

      The name of the table.

    • options: {
          nbRowsToLog?: number;
      } = {}

      An optional object with configuration options:

      • Optional nbRowsToLog?: number

        The number of rows to log when debugging. Defaults to 10 or the value set in the SimpleDB instance.

    Returns Promise<void>

  • Renames an existing table.

    // tableA data is renamed tableB.
    await sdb.renameTable("tableA", "tableB")

    Parameters

    • originalTable: string

      The original name.

    • newTable: string

      The new name.

    Returns Promise<void>

  • Initializes DuckDB and establishes a connection to the database. It also optionnaly installs the spatial extension. It's called automatically with the first method you'll run.

    Returns Promise<void>