Case: Detect existing files at upload & avoid AI reprocessing

Context

Whisperit currently has no mechanism to detect when a document already exists in the system, either at upload time or at AI processing time.

This is a root cause of our quota crisis: users delete entire folders (30-60 documents) and reprocess everything from scratch because they have no visibility into what is already indexed. Direct consequence: runaway upload and Azure processing costs, and the drastic quota measure taken two weeks ago.

Scenario: a lawyer builds an AI context on ~100 case files, generates a summary, returns 20 days later to redo the task. In the meantime, 10-30 new files have arrived in their SharePoint. Today, the only safe option is to delete the whole folder and re-upload everything.

Scope

Two layers, shipped together:

  1. Upload layer — duplicate detection

    • On upload, detect duplicates based on name + size + content hash.

    • Block silent duplication; surface explicit user choice: skip / replace / keep both (auto-rename).

    • Apply at single-file and bulk-upload entry points.

  2. AI layer — incremental context awareness

    • When the user adds files to an existing folder/context, the AI must recognize which documents are already in its context.

    • Prompt the user: "I already have files X, Y, Z in this context. Integrate only the new ones (A, B, C) or rebuild from scratch?"

    • Process only deltas by default; full reprocess on explicit request.
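A minimal sketch of the delta logic in the AI layer, to make the default behavior concrete. It assumes each indexed document is identified by a content hash; `FileRef`, `ContextPlan`, `plan_context_update` and `build_prompt` are hypothetical names for illustration, not existing Whisperit code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRef:
    name: str
    content_hash: str  # assumption: e.g. hex SHA-256 of the file bytes


@dataclass
class ContextPlan:
    already_indexed: list[FileRef]
    to_process: list[FileRef]


def plan_context_update(
    indexed_hashes: set[str],
    folder_files: list[FileRef],
    full_rebuild: bool = False,
) -> ContextPlan:
    """Split the folder into files already in the AI context and new ones."""
    if full_rebuild:
        # Explicit user request: reprocess everything.
        return ContextPlan(already_indexed=[], to_process=list(folder_files))
    known = [f for f in folder_files if f.content_hash in indexed_hashes]
    new = [f for f in folder_files if f.content_hash not in indexed_hashes]
    return ContextPlan(already_indexed=known, to_process=new)


def build_prompt(plan: ContextPlan) -> str:
    """Render the user-facing question from the plan."""
    known = ", ".join(f.name for f in plan.already_indexed)
    new = ", ".join(f.name for f in plan.to_process)
    return (
        f"I already have files {known} in this context. "
        f"Integrate only the new ones ({new}) or rebuild from scratch?"
    )
```

Because the comparison is on the content hash rather than the name, a file that was only renamed in SharePoint would count as already indexed under this sketch; whether that is the desired behavior is a product decision.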

Acceptance criteria

  • Uploading a file already present in the folder triggers a duplicate dialog (skip / replace / keep both).

  • Hash-based detection works even if the file is renamed.

  • Adding files to a folder with an existing AI context only processes the new files unless the user opts for a full rebuild.

  • Telemetry: track duplicate-detection hits and reprocess-avoided events to measure cost impact.

  • Regression: existing single-file workflows are unaffected.
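The upload-side criteria above (duplicate dialog, rename-proof detection, telemetry) could be exercised with a sketch like the following. It assumes SHA-256 as the content hash and an in-memory list as the folder index; `StoredFile`, `classify_upload`, `auto_rename` and the `duplicate_detected` event name are all hypothetical. Treating a same-name, different-content upload as a case for the dialog is also an assumption.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class StoredFile:
    name: str
    size: int
    sha256: str


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def classify_upload(
    name: str,
    data: bytes,
    folder: list[StoredFile],
    telemetry: Optional[Counter] = None,
) -> Optional[str]:
    """Return 'exact', 'renamed', 'name_clash', or None for a new file."""
    digest = sha256_hex(data)
    # Size is a cheap pre-filter; the hash is what confirms identical content.
    same_content = [
        f for f in folder if f.size == len(data) and f.sha256 == digest
    ]
    if same_content:
        if telemetry is not None:
            telemetry["duplicate_detected"] += 1
        if any(f.name == name for f in same_content):
            return "exact"
        return "renamed"  # same bytes stored under a different name
    if any(f.name == name for f in folder):
        return "name_clash"  # same name, different content
    return None


def auto_rename(name: str, taken: set[str]) -> str:
    """'Keep both': pick the first free 'stem (n).ext' variant."""
    p = PurePosixPath(name)
    n = 1
    while f"{p.stem} ({n}){p.suffix}" in taken:
        n += 1
    return f"{p.stem} ({n}){p.suffix}"
```

Any non-None result would open the skip / replace / keep both dialog, and the counter gives a rough basis for estimating avoided processing cost.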

Status: In Review

Board: Whisperit Roadmap

Date: 3 days ago
