Add LawAIGovHub MCP connector to ai-governance-legal and regulatory…#34
Brokemountain wants to merge 2 commits into
…-legal
LawAI Gov Hub is the world's most complete index of official AI regulation.
Every entry links directly to its primary legal source — laws, executive
orders, court cases, and regulatory guidance from 240+ jurisdictions
worldwide. The connector exposes five read-only tools (search_regulations,
fetch_regulation, list_jurisdictions, search_cases, fetch_case) with
provenance (source URL, jurisdiction ISO, retrieved_at, citation) on every
result.
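
To make the provenance claim concrete, here is a minimal Python sketch of a check that a result carries all four provenance fields. The key names other than retrieved_at are assumptions inferred from the wording above, and the sample record is hypothetical rather than real connector output.

```python
# Key names are assumptions; only "retrieved_at" is spelled this way in the PR text.
PROVENANCE_KEYS = ("source_url", "jurisdiction_iso", "retrieved_at", "citation")


def missing_provenance(result: dict) -> list:
    """Return the provenance keys that are absent or empty in a result."""
    return [key for key in PROVENANCE_KEYS if not result.get(key)]


# Hypothetical record for illustration only (not real connector output).
sample = {
    "id": "example-id",
    "source_url": "https://agency.example.gov/act",
    "jurisdiction_iso": "XX",
    "retrieved_at": "2025-01-01T00:00:00+00:00",
    "citation": "Example Act 2025",
}

print(missing_provenance(sample))                 # nothing missing
print(missing_provenance({"source_url": "x"}))    # three keys missing
```

A check like this could run over every result in a health-check probe, so that a record with an empty citation or source URL is flagged instead of being passed along silently.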
What's in the PR:
- .mcp.json entry in both ai-governance-legal and regulatory-legal
pointing to https://mcp.lawaigovhub.com/mcp/ (streamable HTTP).
- Description tightened: "Official AI regulation index — laws,
executive orders, court cases, and regulatory guidance from 240+
jurisdictions, each result linked to its primary legal source with
citation-ready identifiers and jurisdiction ISO codes."
- CONNECTORS.md: LawAI Gov Hub added to the Current Connectors table;
Global AI Regulation Tracker removed from Wanted (superseded by this
connector, which links directly to primary sources, covers an order of
magnitude more jurisdictions, and indexes the AI-fabricated legal
citation cases that motivated this whole area).
- Plugin READMEs: Integrations sections in both READMEs name LawAI Gov
Hub alongside Slack and Google Drive.
- CONNECTOR_HEALTH.md in each plugin dir: a reproducible end-to-end
health-check log, including the /healthz response, advertised tools
from tools/list, five probe results, the 100-result aggregator audit
(0 non-primary URLs leaked under official_only=true), and the
unknown-id error contract.
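
For readers unfamiliar with the .mcp.json entry mentioned in the first bullet, the Python sketch below builds and prints one plausible shape for it. Only the URL is taken from this PR; the "lawaigovhub" server key and the mcpServers / type / url layout are assumptions about the plugin config format, not a copy of the committed file.

```python
import json

# Only the URL comes from the PR text. The server key and the surrounding
# layout are assumptions about how the plugin's .mcp.json is structured.
mcp_config = {
    "mcpServers": {
        "lawaigovhub": {
            "type": "http",  # assumed label for the streamable HTTP transport
            "url": "https://mcp.lawaigovhub.com/mcp/",
        }
    }
}

rendered = json.dumps(mcp_config, indent=2)
print(rendered)
```

The same entry would be placed in both plugin directories, since the PR registers the connector in ai-governance-legal and regulatory-legal.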
Primary-source policy: aggregator host blocklist (regulations.ai,
incidentdatabase.ai, wp.oecd.ai, social/blog hosts) overrides upstream
"official" flags; government / IGO TLD suffixes qualify as primary. Default
search ships official_only=True so the LLM receives primary sources unless
it explicitly opts out.
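
The policy above can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's actual code: the three blocklisted hosts come from the text, while the suffix list, the "official" and "source_url" field names, and the choice to fall back to the upstream flag for unlisted hosts are all hypothetical.

```python
from urllib.parse import urlparse

# Hosts named in the PR's blocklist (the social/blog hosts are not enumerated there).
AGGREGATOR_HOSTS = {"regulations.ai", "incidentdatabase.ai", "wp.oecd.ai"}
# Assumed examples of government / IGO suffixes; the real list is not given in the PR.
PRIMARY_SUFFIXES = (".gov", ".gov.uk", ".europa.eu", ".int")


def is_primary_source(url: str, upstream_official: bool = False) -> bool:
    """Classify a URL, letting the blocklist override any upstream flag."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False
    # Blocklist check runs first, so an upstream "official" flag cannot rescue it.
    if any(host == h or host.endswith("." + h) for h in AGGREGATOR_HOSTS):
        return False
    if host.endswith(PRIMARY_SUFFIXES):
        return True
    # Assumption: for all other hosts, trust the upstream flag.
    return upstream_official


def filter_results(results, official_only=True):
    """Keep only primary-source results unless the caller opts out."""
    if not official_only:
        return list(results)
    return [
        r for r in results
        if is_primary_source(r["source_url"], r.get("official", False))
    ]


# Hypothetical rows for illustration only.
rows = [
    {"source_url": "https://agency.example.gov/rule", "official": True},
    {"source_url": "https://regulations.ai/entry/1", "official": True},
    {"source_url": "https://blog.example.com/post", "official": False},
]
print(len(filter_results(rows)))                       # only the .gov row survives
print(len(filter_results(rows, official_only=False)))  # opt-out returns everything
```

Checking the blocklist before the suffix list is what makes the override hold: a host that an upstream dataset marks as official is still dropped if it is a known aggregator.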
The MCP server itself is open source — see lawaigovhub/mcp_server/ for the
implementation, deploy artifacts (systemd unit, nginx vhost, verify.sh),
and the 12-test suite that backs the health check in CONNECTOR_HEALTH.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
|
Suspected plagiarism issue. I ask Anthropic to refrain from merging this PR.
I am the owner of the Global AI Regulation Tracker (www.techieray.com/GlobalAIRegulationTracker), which was listed in this repository as a wanted connector. I ran a comparison between my tracker's database and the lawaigovhub database, and was surprised to find what appears to be significant reproduction and misappropriation of my database: 843 descriptive titles and 823 descriptive summaries (which I wrote originally) from my dataset seem to match their dataset character-for-character, across 79 jurisdictions and 29 topic categories.
I may well be missing something about how their data pipeline works, and I'd genuinely welcome an explanation. But in the absence of an explanation, the degree of similarity is not consistent with coincidental parallel work. It strongly suggests that content has been copied from the Global AI Regulation Tracker without my authorisation.
This PR also initially tried to remove my tracker from the wanted connector listing, describing it as "superseded" by lawaigovhub (UPDATE: I see this has now been undone by the PR author, but the suspected plagiarism issue still stands).
I have documented this with screenshots and a detailed forensic report, which I am prepared to share with Anthropic maintainers upon request: https://drive.google.com/file/d/16dAoqSVqvOOSPuJ3o8IdPV3lGU2vX6LS/view?usp=drive_link. As a sign of my earnestness, attached is a screenshot of just one of the 800+ reproductions (left is my tracker, right is lawaigovhub).
Thanks for your time.
Raymond Sun (techieray) |
|
|
Hey @techie-ray, I totally get why you'd be concerned looking at your comparison. I checked the pipeline and found what happened: I run a team of 127 Hermes agents that handle most of the research and data-pipeline work autonomously, and one of them added your tracker as an ingestion source without me catching it before opening this PR. I mostly contribute my compute and orchestration to many open-source/community tools, and wasn't paying close enough attention here. My bad.
I re-ran an overlap audit across my current 14,821-entry dataset, removed the 843 overlapping authored summaries, disabled that ingestion source, rebuilt the data, and restored your tracker in the connector list. The live file is here if you want to re-run your comparison:
It should come back clean now, but if you still see anything I missed, please send it over and I'll fix it.
Not trying to start anything here. I just want the PR to add the MCP connector in a way that's fair to everyone and useful for the open-source legal tooling community.
Brokemountain |
|
Thanks so much @Brokemountain, very much appreciate the good faith collaboration here. Let me take a look and come back to you asap! |
|
Hey @Brokemountain, thanks again for working with me to clean up the data. I've re-run my check, and there are only around 100 matching entries left (listed in the attachment). I'd be grateful for your cooperation in removing them from your database (both the MCP and website versions). |
|
Hey @techie-ray, Done, thanks for sending the follow-up list. I cleaned up the remaining entries from the attachment and redeployed both the website data and the MCP service. The live dataset is now at 14,806 entries, and I verified the MCP is reading the same cleaned data. I re-ran checks against your spreadsheet on my side:
Really appreciate you taking the time to check this with me. Sorry again that I didn’t catch the agent output earlier. Please let me know if you still see anything off. |
|
Thanks @Brokemountain. I've found more matches in my re-review. I'd be grateful if you could remove these entries too. |
|
Hey @techie-ray, thanks. I’ve already removed the entries listed in the 18 May CSV from both the website dataset and the MCP dataset, then rebuilt and redeployed both. The current live website index is now at 14,717 entries, and the MCP service has been restarted on the same cleaned repo/data.
Could you refresh/re-run against the live file again? https://lawaigovhub.com/data/regulations.json
On my side, the entries from your latest CSV now show: 0 matching URLs.
Thanks again for checking this carefully. |

Add LawAI Gov Hub MCP connector (ai-governance-legal, regulatory-legal)
Summary
This PR adds the LawAI Gov Hub MCP connector to ai-governance-legal and regulatory-legal.
LawAI Gov Hub is an open, source-first AI governance regulation index. It exposes public, citation-ready records for lawyers, policy teams, researchers, and AI governance practitioners who need to verify AI-law claims against primary sources.
The public site has served 80,000+ users and is used by researchers and teams at leading universities and frontier AI labs.
What the connector provides
Tools
- search_regulations: search AI-governance records by keyword, jurisdiction, year range, document type, and primary-source filter
- fetch_regulation: fetch a single regulation record by stable id
- list_jurisdictions: list covered jurisdictions and entry counts
- search_cases: search documented legal cases involving AI-fabricated citations
- fetch_case: fetch a single hallucination case record
Why this is useful for Claude legal workflows
The legal plugins already instruct users to verify citations and legal claims against primary sources. This connector makes that workflow direct: Claude can return the source URL, jurisdiction metadata, and citation-ready record instead of relying on open-ended web search.
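
As a rough illustration of that workflow at the protocol level, the Python sketch below builds the JSON-RPC 2.0 messages an MCP client would send to list tools and call search_regulations. The tools/list and tools/call method names follow the MCP specification; the argument names "query" and "jurisdiction" and their values are hypothetical, since this PR does not give the tool's input schema. Only official_only is named in the PR text.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # simple monotonically increasing request ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object for an MCP server."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it advertises.
list_tools = jsonrpc_request("tools/list")

# Call one tool. "query" and "jurisdiction" are hypothetical argument names.
search = jsonrpc_request(
    "tools/call",
    {
        "name": "search_regulations",
        "arguments": {
            "query": "automated decision-making",
            "jurisdiction": "XX",
            "official_only": True,
        },
    },
)

print(json.dumps(list_tools))
print(json.dumps(search, indent=2))
```

A real session would first complete the MCP initialize handshake and then POST these bodies to the connector endpoint. An MCP client library normally handles both steps, so hand-built messages like these are mainly useful for debugging.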
It is especially useful for:
Data policy
The MCP is read-only and returns structured data only. Regulation results are designed around primary-source verification: statutes, bills, executive orders, court decisions, regulator publications, government materials, and IGO instruments.
The public website and MCP use the same cleaned dataset.
Endpoint