Document Client

Gunpal Jain edited this page Feb 16, 2025 · 2 revisions

Introduction

The DocumentsClient provides methods for interacting with the Gemini API's Documents endpoint. It lets you create, list, retrieve, update, delete, and query documents within a corpus. A document is an individual unit of content within a corpus, and its text is what semantic search runs against.

Details

The DocumentsClient supports the following operations:

Creating a Document

The CreateDocumentAsync method creates a new document within a specified corpus.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var parentCorpus = "corpora/my-corpus-id"; // Replace with the parent corpus name

var document = new Document
{
    DisplayName = "My Sample Document",
    CustomMetadata = new List<CustomMetadata>
    {
        new CustomMetadata { Key = "my key", StringValue = "This is a test document" }
    },
    // ... other document properties ...
};

var createdDocument = await documentsClient.CreateDocumentAsync(parentCorpus, document);

if (createdDocument != null)
{
    Console.WriteLine($"Document created: {createdDocument.Name}");
}
else
{
    Console.WriteLine("Failed to create document.");
}

Querying a Document

The QueryDocumentAsync method performs semantic search within a specific document.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name

var queryDocumentRequest = new QueryDocumentRequest
{
    Query = "What is mentioned about topic X in this document?",
    // ... other query parameters ...
};

var queryDocumentResponse = await documentsClient.QueryDocumentAsync(documentName, queryDocumentRequest);

if (queryDocumentResponse != null && queryDocumentResponse.RelevantChunks != null)
{
    foreach (var chunk in queryDocumentResponse.RelevantChunks)
    {
        Console.WriteLine($"Relevant Chunk: {chunk.ChunkData.Text}");
    }
}
else
{
    Console.WriteLine("No relevant chunks found.");
}

Listing Documents

The ListDocumentsAsync method retrieves a list of documents within a corpus.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var parentCorpus = "corpora/my-corpus-id"; // Replace with the parent corpus name

var listDocumentsResponse = await documentsClient.ListDocumentsAsync(parentCorpus); // You can provide pageSize and pageToken

if (listDocumentsResponse != null && listDocumentsResponse.Documents != null)
{
    foreach (var document in listDocumentsResponse.Documents)
    {
        Console.WriteLine($"Document Name: {document.Name}");
    }
}
else
{
    Console.WriteLine("No documents found.");
}
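
The comment above notes that ListDocumentsAsync accepts pageSize and pageToken. The following is a minimal sketch of the usual token-based paging loop, not the library's own implementation. The Paging class, the Page record, and the fetchPage delegate are hypothetical names introduced for illustration; the delegate stands in for a call to ListDocumentsAsync, and it assumes the list response exposes a next-page token.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the GenerativeAI library.
public static class Paging
{
    // One page of results: the items plus the token for the next page
    // (null or empty when there are no more pages).
    public sealed record Page<T>(IReadOnlyList<T> Items, string? NextPageToken);

    // Calls fetchPage repeatedly, feeding each response's token into the
    // next request, until the service stops returning a token.
    public static async Task<List<T>> CollectAllAsync<T>(
        Func<string?, Task<Page<T>>> fetchPage)
    {
        var all = new List<T>();
        string? pageToken = null; // null requests the first page

        do
        {
            var page = await fetchPage(pageToken);
            all.AddRange(page.Items);
            pageToken = page.NextPageToken;
        } while (!string.IsNullOrEmpty(pageToken));

        return all;
    }
}
```

To use it with the DocumentsClient, the delegate would call ListDocumentsAsync with the incoming token and wrap the returned documents and next-page token in a Page. The exact parameter and property names depend on the library version, so check them against the API reference.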

Getting a Document

The GetDocumentAsync method retrieves a specific document by name.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name

var document = await documentsClient.GetDocumentAsync(documentName);

if (document != null)
{
    Console.WriteLine($"Document Display Name: {document.DisplayName}");
    Console.WriteLine($"Document Content: {document.Content?.Text}");
}
else
{
    Console.WriteLine("Document not found.");
}

Updating a Document

The UpdateDocumentAsync method updates an existing document.

using GenerativeAI.Clients;
using GenerativeAI.Types;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name

var updatedDocument = new Document
{
    Name = documentName, // Important: Include the name in the updated document object.
    DisplayName = "My Updated Document Name",
    CustomMetadata = new List<CustomMetadata>
    {
        new CustomMetadata { Key = "my key", StringValue = "This is a test document updated" }
    },
    // ... other updated properties ...
};

string updateMask = "displayName,customMetadata"; // Comma-separated list of the fields to update; must match the properties set above

var resultDocument = await documentsClient.UpdateDocumentAsync(documentName, updatedDocument, updateMask);

if (resultDocument != null)
{
    Console.WriteLine($"Document updated: {resultDocument.DisplayName}");
}
else
{
    Console.WriteLine("Failed to update document.");
}
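
Because the update mask is a plain comma-separated string, it is easy to mistype or to let it drift from the fields actually set on the Document. The sketch below builds the mask from individual field names. UpdateMasks.Build is a hypothetical helper, not part of the library. At the time of writing, the Gemini API reference lists displayName and customMetadata as the only document fields that can be updated, so treat any other field name as an assumption to verify.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the GenerativeAI library.
public static class UpdateMasks
{
    // Joins field names into the comma-separated form used by updateMask,
    // trimming whitespace and dropping blanks and duplicates.
    public static string Build(params string[] fields)
    {
        var cleaned = fields
            .Select(f => f?.Trim())
            .Where(f => !string.IsNullOrEmpty(f))
            .Distinct(StringComparer.Ordinal)
            .ToList();

        if (cleaned.Count == 0)
        {
            throw new ArgumentException("At least one field name is required.", nameof(fields));
        }

        return string.Join(",", cleaned);
    }
}
```

For example, UpdateMasks.Build("displayName", "customMetadata") produces the string "displayName,customMetadata", which can be passed as the updateMask argument.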

Deleting a Document

The DeleteDocumentAsync method deletes a document.

using GenerativeAI.Clients;

// ... other code ...

var documentsClient = new DocumentsClient(platform, httpClient, logger); // Initialize DocumentsClient

var documentName = "corpora/my-corpus-id/documents/my-document-id"; // Replace with the document name

await documentsClient.DeleteDocumentAsync(documentName); // You can optionally set force to true

Console.WriteLine($"Document deleted: {documentName}");

Important Considerations

  • Ensure proper authorization is configured before using the DocumentsClient. See the Authentication page.
  • Replace placeholder document names, IDs, and corpus names with actual values.
  • Handle potential exceptions during API calls.
  • Be mindful of rate limits when making frequent requests. See the official documentation for details.
  • The updateMask parameter in UpdateDocumentAsync controls which fields are written. Only the fields listed in the updateMask are modified; values set on the Document object but absent from the mask are left unchanged.
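
The points above about exceptions and rate limits can be combined into a small retry wrapper. This is a sketch under stated assumptions rather than the library's recommended approach: Retry.WithBackoffAsync is a hypothetical helper, and it assumes transient failures surface as HttpRequestException. Substitute whichever exception type the library actually throws.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the GenerativeAI library.
public static class Retry
{
    // Runs the operation, retrying on HttpRequestException with a delay that
    // doubles after each failed attempt. The final failure is rethrown.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        var delay = initialDelay ?? TimeSpan.FromSeconds(1);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Assumed to be transient: wait, then try again.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call such as documentsClient.GetDocumentAsync(documentName) could then be wrapped as Retry.WithBackoffAsync(() => documentsClient.GetDocumentAsync(documentName)). Retrying is only appropriate for failures that are likely to be temporary; errors such as a missing document or invalid credentials should be reported instead.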

API Reference
