Why We Stopped Doing Deep Research Manually (And Built Parallect Instead)

A team wrestling with manual deep research on the left, a synthesized multi-provider pipeline on the right, and the Parallect gem glowing in the middle.

Deep research as a category works. We love what OpenAI, Gemini, Perplexity, Grok, and Anthropic ship when you point them at a real question. The reports are good.

Our problem wasn't the output. Our problem was the workflow around it.

This is the story of how our team at SecureCoders ended up running deep research by hand across two providers at a time, got tired of it, automated it for ourselves, and then realized we'd built a product.

The problem we were living with

The shape of deep research today is strange if you stop and look at it.

You type a question. You wait thirty to sixty minutes. You come back, if you remember to come back, to a long chat thread. You can read it. You can ask follow-up questions inside it. What you can't do is act on it without a lot of friction.

For us, that gap was too wide. We build with an AI agent sitting on top of our workflow. The agent triggers research, but the research finishes somewhere the agent can't really see. A teammate has to scroll back through the thread, copy the relevant parts, and paste them somewhere usable. The intelligence was in the report. The handoff wasn't.

A research report you have to babysit is not a research tool. It's homework.

The two-provider experiment

The first thing that improved our output, before any automation, was running the same query through more than one provider.

We started doing it by hand. Paste the question into OpenAI Deep Research. Paste the same question into Gemini Deep Research. Wait for both. Download both. Give both to our agent and let it synthesize across them.

The result was noticeably better than either one alone. Different sources. Different angles. Different blind spots cancelling each other out. It wasn't just "two reports"; the synthesis across them was better than either report in isolation.

The catch: somebody had to babysit the whole thing. Copy, paste, wait, download, hand off. Expensive in hours, not in dollars.

That awkward manual workflow (run in parallel, hand off, synthesize) is the thing that eventually became Parallect.
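The run-in-parallel, hand-off, synthesize loop can be sketched in a few lines. This is a minimal illustration, not Parallect's actual code: `run_provider`, `fan_out`, and `hand_off` are hypothetical names, and the provider call is a stub standing in for a real deep-research API request that could take many minutes.

```python
import asyncio

# Hypothetical provider call (assumption): a real version would submit the
# query to a provider's deep-research API and wait for the finished report.
async def run_provider(name: str, query: str) -> dict:
    await asyncio.sleep(0)  # stand-in for the long-running research job
    return {"provider": name, "report": f"[{name}] report for: {query}"}

async def fan_out(query: str, providers: list[str]) -> list[dict]:
    # Start every provider at once and wait for all of them to finish.
    jobs = [run_provider(p, query) for p in providers]
    results = await asyncio.gather(*jobs, return_exceptions=True)
    # Drop failed providers so one outage does not sink the whole run.
    return [r for r in results if not isinstance(r, Exception)]

def hand_off(reports: list[dict]) -> str:
    # Bundle the raw reports into one payload the agent can synthesize from.
    return "\n\n".join(r["report"] for r in reports)

if __name__ == "__main__":
    reports = asyncio.run(fan_out("What changed in TLS 1.3?", ["openai", "gemini"]))
    print(hand_off(reports))
```

The point of the sketch is the shape: the wait happens once, for the slowest provider, and nobody has to sit in front of a browser tab while it does.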

Research plus an agent is the actual unlock

This is the argument we keep coming back to.

When the agent is the thing that kicks off research, it already knows why. Which conversation the question came from. Which decision it's supporting. Which teammate is waiting. When the research finishes, the agent can bring the findings back to where the question originated and do something with them: update a doc, open a ticket, propose a change, flag a risk.

Research stopped being a noun we read and started being a step in a workflow.

Once we saw that, we couldn't go back to "wait an hour, remember to check the tab, copy the parts that mattered." That version of deep research feels broken once you've seen something better.

We automated it, and the scope kept expanding

The first version of Parallect was, honestly, a script. Same two providers. No babysitting. Agent-friendly output format.

It worked immediately. And the moment we could see the shape of the automated version, the scope expanded on its own:

Add Perplexity. Add Grok. Add Claude.
Auto-synthesize across all of them instead of handing raw reports to the agent.
Cross-check claims between providers to surface contradictions and likely hallucinations.
Tune the prompt per provider, because each one responds to framing differently.
Structure the output (claims, citations, provider breakdowns) so an agent can actually consume it.
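To make "structured output" concrete, here is one way the claims, citations, and provider breakdowns could be modeled. It is a sketch under assumptions: the `Claim` and `Synthesis` types and their field names are invented for illustration and are not Parallect's real schema.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    citations: list[str]
    # Providers whose reports support this claim (hypothetical field).
    providers: set[str] = field(default_factory=set)
    # Providers whose reports contradict it (hypothetical field).
    contradicted_by: set[str] = field(default_factory=set)

@dataclass
class Synthesis:
    query: str
    claims: list[Claim]

    def provider_breakdown(self) -> dict[str, int]:
        # Number of claims each provider contributed to.
        counts: dict[str, int] = {}
        for claim in self.claims:
            for provider in claim.providers:
                counts[provider] = counts.get(provider, 0) + 1
        return counts

    def disputed(self) -> list[Claim]:
        # Claims at least one other provider contradicts: the ones an agent
        # should treat as possible hallucinations and verify first.
        return [c for c in self.claims if c.contradicted_by]
```

An agent consuming this shape can filter, count, and act on individual claims instead of re-reading a long report.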

Somewhere around the fourth provider, we noticed we weren't building an internal tool anymore. The workflow we'd been running by hand, and were now scripting, was one that other teams had quietly been running too. We asked around. They had the same problem. They just hadn't automated it yet.

So we kept going.

Five overlapping glowing fields of view on a dark grid, each a different color, with most bright dots sitting inside single fields and only a few in the overlapping regions.

This one we quantified the hard way. We ran 90 research queries through 5 providers and 8 model variants. We extracted every factual claim and asked a simple question: what share of claims is found by one provider and no other?

The answer: 86%. Full writeup →

86% of factual findings across 90 research queries were surfaced by only one provider. Not phrasing differences. Literally different sources, different claims.

The reason divergence is that high, it turns out, is upstream: half of the source domains each provider cites are exclusive to that provider. Only about 1% of domains are cited by all of them. Providers aren't just saying the same things differently. They're reading different websites.

This is the argument for multi-provider coverage, stated in numbers. Your intuition says "the best model will find the best stuff." The data says the best model finds the stuff it finds. Union beats intersection, consistently, across every topic we tested.
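The measurement itself is simple set arithmetic. The sketch below uses toy data and assumes claims have already been normalized to comparable identifiers, which is the hard part in practice; `exclusive_share` is an illustrative name, not the function used in the study.

```python
from collections import Counter

def exclusive_share(claims_by_provider: dict[str, set[str]]) -> float:
    # For each distinct claim, count how many providers surfaced it.
    seen = Counter(c for claims in claims_by_provider.values() for c in claims)
    if not seen:
        return 0.0
    # A claim is exclusive when exactly one provider found it.
    exclusive = sum(1 for n in seen.values() if n == 1)
    return exclusive / len(seen)

# Toy data (made up for illustration, not the study's results).
toy = {
    "provider_a": {"c1", "c2", "c3"},
    "provider_b": {"c3", "c4"},
    "provider_c": {"c5"},
}

union = set.union(*toy.values())                # everything any provider found
intersection = set.intersection(*toy.values())  # what every provider found

print(exclusive_share(toy), len(union), len(intersection))
```

In the toy data, four of the five distinct claims come from a single provider, and no claim is shared by all three. That is the union-versus-intersection gap in miniature.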

We don't want to guess which provider is strongest for a given query. Neither do our users. Parallect runs the query across all of them and synthesizes the results, so you get full coverage without having to know in advance whether this specific question plays better on OpenAI or Grok.

Query crafting is a feature

One thing we underestimated and now think about constantly: a lot of people don't run deep research because they don't know how to write a research query.

They have a half-formed question. They're not sure how to scope it. They're not sure what to ask for. They hesitate. They don't run it.

Parallect handles query crafting. You can start with two sentences, the minimum viable version of your question, and it produces a well-formed research prompt with explicit scope, dimensions, and deliverables. You review it, edit it, or discard it. (How refinement works →)

This sounds minor. It isn't. It's the difference between deep research being a tool for people who already know how to use it and a tool everyone on the team can use. When we shipped query crafting internally, research volume went up sharply, driven by the people who'd been hesitating.

Where it's going: prxhub

A glowing octagonal hub at the center of a mesh of smaller research-bundle nodes connected by luminous lines, with particles flowing in and out.

The next layer we're building is a research caching layer. We call it prxhub.

The idea is simple. When research runs, the synthesis (claims, sources, provider breakdowns, contradictions) gets stored in a shared, queryable format. The next time someone asks a related question, the agent can check prxhub before reaching out to providers. If the answer's already there, it's free and instant. If it's partially there, we only fetch what's missing.
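The check-before-fetch logic can be sketched as follows. Everything here is hypothetical: prxhub's real interface is not described in this post, and the sketch assumes a question has already been broken into sub-questions that match cache keys exactly, where a real system would need fuzzier matching for "related" questions.

```python
from typing import Callable

def plan_research(needed: set[str], cache: dict[str, str]) -> tuple[dict[str, str], set[str]]:
    # Split the sub-questions into cache hits and the ones still missing.
    hits = {q: cache[q] for q in needed if q in cache}
    missing = needed - set(hits)
    return hits, missing

def answer(needed: set[str], cache: dict[str, str],
           fetch: Callable[[str], str]) -> dict[str, str]:
    hits, missing = plan_research(needed, cache)
    for q in sorted(missing):
        # Only the gaps cost provider time; results are stored for next time.
        cache[q] = fetch(q)
        hits[q] = cache[q]
    return hits
```

With a fully warm cache `fetch` is never called, and with a partially warm one it is called only for the missing sub-questions.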

That's useful on its own. What makes it compelling is the network effect. The more research runs through the system, the better the coverage for everyone. An agent searching the web can pull from existing syntheses that have already been cross-checked and verified by other agents on other teams.

We don't think of this as just "caching." We think of it as shared memory for AI research.

Why we're publishing this

Every good product we've ever worked on started as an internal tool that got out of hand. Parallect is that story again.

We were doing something painful enough, often enough, that automating it was an obvious win. Once we automated it, the automation revealed a bigger problem than the one we started with. Once we looked at the bigger problem, we realized it wasn't ours alone.

If you're doing the thing we were doing, pasting the same query into two or three providers, waiting, downloading, reconciling by hand, stop. Let us do it.

If your question feels half-formed, let Parallect finish it. First research is on us.