How to extract data from bank statements with the Sensible Node SDK

Updated on
December 19, 2023
5
min read
Contributors
No items found.
Author
How to extract data from bank statements with the Sensible Node SDK
Table of contents
Turn documents into structured data
Node.js document extraction for developers
Get started free
Share this post

Use Sensible’s Node SDK to directly integrate Sensible’s document extraction and classification APIs into your product. With the Sensible SDK you can easily extract key information from your custom documents, powered by Sensible’s developer platform.

What we'll cover

This blog post briefly walks you through installing the Sensible Node SDK and integrating it into an example application. You’ll extract data from a directory of documents, view the data as JSON, and convert the data to Excel. To follow along with this tutorial, you'll need:

- Node installed (any version that’s not end-of-life)

- Basic knowledge of Node

- A free Sensible account

- A code editor, i.e., VS Code

Example app scenario

Imagine you’re a startup that helps homeowners connect to lenders and government subsidies for energy-efficient home improvements. As part of qualifying your potential borrowers, you ask them to upload 3 recent bank statements. You need to extract data from the bank statements and get it into a backend database as part of your business logic.

Tutorial

Take the following steps to extract sample data from bank statements with the Sensible Node SDK.

1. Get a Sensible account

To follow this tutorial, sign up for a free Sensible account, then create and make note of your API key.

2. Install the Sensible SDK

In an environment with Node installed, open a command prompt and enter the following commands to create a test project:

mkdir sensible-test
cd sensible-test
touch index.mjs

Then, install the SDK:

npm install sensible-api

3. Download example bank statements

To extract real document data in this tutorial, you’ll use a collection of example bank statements. In your test project, open a command prompt and enter the following commands to download and unzip the sample documents:

wget https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/blog_bank_statements.zip && unzip blog_bank_statements.zip

4. Configure extractions for bank statements

To add extraction configurations for bank statements to your Sensible account,  follow the steps in Out-of-the-box extractions.

Explanation: For this tutorial, we’ll skip writing extraction configurations from scratch, and use Sensible’s pre-built, open-source configuration library. This step adds a bank_statements document type in the Sensible app from the Sensible configuration library. The document type is an API endpoint, which you’ll call later in this tutorial. It contains a collection of SenseML extraction configurations, one for each of your target banks, for example Bank of America, Chase, and so forth.  

5. Extract document data

See the following code for a working example of how to use the SDK for document extraction in your own app.

The example:

  1. Filters a directory to find PDF files.
  2. Extracts data from the PDF files using the extraction configurations in a bank_statements document type.
  3. Prints extraction results to the console.
  4. Compiles the extractions into an Excel file. 
import { promises as fs } from "fs";
import { SensibleSDK } from "sensible-api";
import got from "got";
const apiKey = process.env.SENSIBLE_APIKEY;
const sensible = new SensibleSDK(apiKey);




// find all the PDFS in your unzipped directory
const dir = "./bank_statements";
const files = (await fs.readdir(dir)).filter((file) => file.match(/\.pdf$/));


const extractions = await Promise.all(
  files.map(async (filename) => {
    const path = `${dir}/${filename}`;
    // use the bank statements endpoint to extract data from the PDFs
    return sensible.extract({
      path,
      documentType: "bank_statements",
      documentName: filename
    });
  })
);


const results = await Promise.all(
  extractions.map((extraction) => sensible.waitFor(extraction))
);


console.log(results);


const excel = await sensible.generateExcel(extractions);
console.log("Excel download URL:");
console.log(excel);
const excelFile = await got(excel.url);
await fs.writeFile(`${dir}/output.xlsx`, excelFile.rawBody);

To run the preceding code example, take the following steps:

1. Paste the preceding code into your empty index.mjs file.

2. Configure your environment so that your SENSIBLE_APIKEY  specifies your API key.

3. In a command prompt in the same directory as your index.mjs file, run the code with the following command:

node index.mjs

6. Check results

You should see results like the following excerpted response, which show a couple of extraction results.

Note: You'll notice several fields are null, because the example documents are redacted.

[
  {
    id: '433441d8-101c-48f8-a108-14d644e81f8a',
    created: '2023-12-18T17:45:12.472Z',
    completed: '2023-12-18T17:45:14.422Z',
    status: 'COMPLETE',
    type: 'bank_statements',
    document_name: 'wells_fargo_checking_sample.pdf',
    configuration: 'wells_fargo_checking',
    environment: 'production',
    page_count: 5,
    parsed_document: {
      start_date: [Object],
      end_date: [Object],
      customer_name: [Object],
      customer_address: [Object],
      account_summary_table: null,
      accounts: [Array]
    },
    validations: [],
    validation_summary: {
      fields: 6,
      fields_present: 5,
      errors: 0,
      warnings: 0,
      skipped: 0
    },
    classification_summary: [ [Object], [Object], [Object], [Object], [Object] ],
    errors: [],
    download_url: 'https://sensible-so-document-type-bucket-prod-us-west-2.s3.us-west-2.amazonaws.com/17c43ab5-e892-43eb-b72b-48be95ee3095/4fb67465-4752-4645-bea6-74a867433aee/EXTRACTION/REDACTED,
    file_metadata: { info: [Object], metadata: [Object] },
    coverage: 0.7857142857142857
   },
   {
    id: '387cd6b8-d28a-4495-8e68-18bd6d88a9a7',
    created: '2023-12-18T17:16:08.759Z',
    completed: '2023-12-18T17:16:11.658Z',
    status: 'COMPLETE',
    type: 'bank_statements',
    document_name: 'boa_sample.pdf',
    configuration: 'boa',
    environment: 'production',
    page_count: 6,
    parsed_document: {
      start_date: [Object],
      end_date: [Object],
      customer_name: null,
      customer_address: null,
      account_summary_table: null,
      accounts: [Array]
    },
    validations: [],
    validation_summary: {
      fields: 6,
      fields_present: 3,
      errors: 0,
      warnings: 0,
      skipped: 0
    },
    classification_summary: [ [Object], [Object], [Object], [Object], [Object] ],
    errors: [],
    download_url: 'https://sensible-so-document-type-bucket-prod-us-west-2.s3.us-west-2.amazonaws.com/17c43ab5-e892-43eb-b72b-48be95ee3095/REDACTED',
    file_metadata: { info: [Object] },
    coverage: 0.6428571428571429
  }
]
Excel download URL:
{
  url: 'https://sensible-so-document-type-bucket-prod-us-west-2.s3.us-west-2.amazonaws.com/17c43ab5-e892-43eb-b72b-48be95ee3095/REDACTED'
}

Note To access data in the parsed_document object, for example to see details of transactions in the Accounts object, you can add statements similar to the following to your script:

results.forEach(results => {
  const startDate = results.parsed_document.start_date;
  const accounts = results.parsed_document.accounts;
  console.log(`Statement start date: ${JSON.stringify(startDate)}`);
  console.log(`Accounts: ${JSON.stringify(accounts, null, 2)}`);
  console.log('\n'); // Separate each blob's output for better readability
});

If you add the preceding code, you should see results like the following logged to the console for each extraction:

Statement start date: 
{
  "source": "January 14, 2021",
  "value": "2021-01-14T00:00:00.000Z",
  "type": "date"
}

Accounts: 

[
  {
    "account_type": {
      "type": "string",
      "value": "CHECKING"
    },
    "account_number": {
      "type": "string",
      "value": "0000"
    },
    "beginning_balance": {
      "source": "2,967.65",
      "value": 2967.65,
      "unit": "$",
      "type": "currency"
    },
    "deposits": {
      "source": "6,411.44",
      "value": 6411.44,
      "unit": "$",
      "type": "currency"
    },
    "withdrawals": {
      "source": "6,320.35",
      "value": 6320.35,
      "unit": "$",
      "type": "currency"
    },
    "end_balance": {
      "source": "3,002.41",
      "value": 3002.41,
      "unit": "$",
      "type": "currency"
    },
    "deposits_table": [
      {
        "date": {
          "value": "01/15",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "684.18",
          "value": 684.18,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/20",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "175.00",
          "value": 175,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/22",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "807.93",
          "value": 807.93,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/29",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "1,625.94",
          "value": 1625.94,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/29",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "684.18",
          "value": 684.18,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/05",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "609.18",
          "value": 609.18,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/08",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "200.00",
          "value": 200,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/09",
          "type": "string"
        },
        "description": {
          "value": "1",
          "type": "string"
        },
        "amount": {
          "source": "1,625.00",
          "value": 1625,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/10",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "0.03",
          "value": 0.03,
          "unit": "$",
          "type": "currency"
        }
      }
    ],
    "withdrawals_table": [
      {
        "date": {
          "value": "01/15",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "200.00",
          "value": 200,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/19",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "250.00",
          "value": 250,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/25",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "189.14",
          "value": 189.14,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/02",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "173.51",
          "value": 173.51,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/02",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "3,108.24",
          "value": 3108.24,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/08",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "899.46",
          "value": 899.46,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/08",
          "type": "string"
        },
        "description": null,
        "amount": {
          "source": "1,500.00",
          "value": 1500,
          "unit": "$",
          "type": "currency"
        }
      }
    ],
    "transaction_table": null
  },
  {
    "account_type": {
      "type": "string",
      "value": "SAVINGS"
    },
    "account_number": {
      "type": "string",
      "value": "0000"
    },
    "beginning_balance": {
      "source": "51,752.63",
      "value": 51752.63,
      "unit": "$",
      "type": "currency"
    },
    "deposits": {
      "source": "250.79",
      "value": 250.79,
      "unit": "$",
      "type": "currency"
    },
    "withdrawals": null,
    "end_balance": {
      "source": "52,003.24",
      "value": 52003.24,
      "unit": "$",
      "type": "currency"
    },
    "deposits_table": null,
    "withdrawals_table": null,
    "transaction_table": [
      {
        "date": null,
        "description": {
          "value": "Beginning Balance",
          "type": "string"
        },
        "amount": {
          "source": "51,752.63",
          "value": 51752.63,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "01/19",
          "type": "string"
        },
        "description": {
          "value": "250.00",
          "type": "string"
        },
        "amount": {
          "source": "52,002.63",
          "value": 52002.63,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/10",
          "type": "string"
        },
        "description": {
          "value": "0.79 Interest Payment",
          "type": "string"
        },
        "amount": {
          "source": "52,003.42",
          "value": 52003.42,
          "unit": "$",
          "type": "currency"
        }
      },
      {
        "date": {
          "value": "02/10",
          "type": "string"
        },
        "description": {
          "value": "Federal Interest Withheld -0.18",
          "type": "string"
        },
        "amount": {
          "source": "52,003.24",
          "value": 52003.24,
          "unit": "$",
          "type": "currency"
        }
      }
    ]
  }
]

You can import the extracted document data as JSON into your database, or for a quick view of the data, you can take a look at the Excel file the script also generated.

Conclusion

Sensible’s SDK gives you convenient access to the powerful and flexible Sensible developer platform for intelligent document automation. Once you’ve configured your extractions in the Sensible app, integrating is as easy as a few lines of SDK code. Then you’ve got the structured and typed document data at your fingertips as JSON or Excel data.  

Check out our prebuilt bank statement config in our open-source library, check out our docs, and sign up for a free account (no credit card required) to start extracting data from your own documents.

Node.js document extraction for developers
Turn documents into structured data
Get started free
Share this post

Turn documents into structured data

Stop relying on manual data entry. With Sensible, claim back valuable time, your ops team will thank you, and you can deliver a superior user experience. It’s a win-win.