Sophon aggregates AI research artefacts - papers, evals, models, tools - into a single readable surface. Every piece of content has a declared hosting policydriven by its license. We err on the conservative side: if we can't confidently confirm a license permits redistribution, we link out instead.
Hosting tiers
Permissive licenses - CC-BY, CC0, MIT, Apache-2.0, public domain. Rendered natively with attribution.
Non-commercial Creative Commons variants (CC-BY-NC, CC-BY-NC-SA). A short fair-use snippet plus link to the source.
arXiv default license and similar non-exclusive distribution arrangements. Abstract and metadata only - full PDF lives on the author's host.
All-rights-reserved, proprietary, and unknown licenses. Metadata + link out only - no extracted content stored.
How we detect licenses
We read the license each source declares - arXiv's license field, a Hugging Face model or dataset card's license, or the license file in a GitHub repository. arXiv's default license is treated as notCreative Commons, so those papers stay abstract-only. When no license is declared, or we can't confirm one, we fall back to the most conservative tier - metadata and a link out, nothing extracted.
What you can take down
If you are an author or rights-holder and want content removed or re-tiered, see the DMCA / takedown page. We respond within 72 hours.
What you cannot find here
- Closed-license papers as full text - abstract + link only.
- Hugging Face datasets in their entirety - sample rows only.
- Vendor blog posts or Twitter threads.
- Code from non-permissive repositories.