Skip to content
This repository was archived by the owner on Mar 4, 2026. It is now read-only.
This repository was archived by the owner on Mar 4, 2026. It is now read-only.

Sessions leaking from Pool #134

@DuaneGarber

Description

@DuaneGarber

We are experiencing an issue where sessions are not being constrained. During our stress testing we are making 4 requests in parallel all by primary keys, 50k times.

During initial test runs of 2000 records, the sessions count grew by 711. Effectively was we scaled it up in our sandbox we quickly ran out of all 10k sessions in minutes, locking the database.

The session count is being confirmed with gcloud spanner databases sessions list --database=users --instance=users-test | grep "name" | wc -l.

We looked at the previous pool leak fix and confirmed we were using that code but it did not solve the issue. (#91)

Environment details

  • OS: Mac OS high sierra
  • Node.js version: v8.9.0
  • npm version: v5.6.0
  • @google-cloud/spanner version: v1.1.0

Steps to reproduce

The queries are simple run requests, here is a code sample where we can reproduce.

This code is obviously poor practice, but we needed queries to take a longer than near immediate response to force error manifestation.

const spanner = require('@google-cloud/spanner');
const Promise = require('bluebird');
const crypto = require('crypto');

const database = spanner({ projectId: 'test-project' }).instance('test-instance').database('users',
  {
    acquireTimeout: 30000,
    keepAlive: 1,
    max: 10,
  }
);

function getOffset() {
  return Math.floor(Math.random() * 10000)
}

function getReadPromise() {
  return database.run({
    sql:
      'SELECT * FROM users limit 1 offset @offset',
    params: {
      offset: getOffset(),
    },
  });
}

function start() {
  console.log('Starting');
  const arr = new Array(5).fill(1);
  return Promise.map(arr, () => Promise.all([
    getReadPromise(),
    getReadPromise(),
    getReadPromise(),
  ]));
}

setTimeout(() => {
  start().then(start).then(start).then(start).then(start);
}, 3000);

The issue doesn’t manifest itself in version 0.8.2, which 0.8.3 is when the pooling was updated.

Environment details
OS: Mac OS High sierra
Node.js version: v8.9.0
npm version: v5.6.0
version: v1.1.0

We have a solution and will make the PR soon.

Issue Analysis

Here is our analysis of the issue after investigation:
In the following code, the code removes the session either from reads_ or writes_ array.

image

getNextAvailableSession_ method is called by getSession_ which is called by acquireSession_. getNextAvailableSession_ removes the session from reads_ or writes_ array and acquireSession_ adds the session to borrowed_ array as shown in the code below.

image

This code is asynchronous, which means it is creating a race condition. It may end up with a session being borrowed from reads or writes and it has also been yet to be marked as borrowed. This has the effect of not being counted towards the maximum amount of sessions available to the session. Which is ultimately the error issue causes excess sessions to be created and these sessions are not accounted for such that it will eventually consume all of the available sessions for the spanner instance.

Proposed Solution

We can fix the problem by making sure that when we remove a session from reads_ or writes_ array, we should mark the session as burrowed synchronously.

image

Metadata

Metadata

Assignees

Labels

🚨This issue needs some love.api: spannerIssues related to the googleapis/nodejs-spanner API.triage meI really want to be triaged.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions