Perceptron Inc. has launched Perceptron Mk1, a proprietary video analysis reasoning model designed to understand real-world video, live feeds, spatial relationships, object dynamics, and cause-and-effect events.
The headline claim is aggressive: Mk1 is priced at $0.15 per million input tokens and $1.50 per million output tokens, which the article says is 80–90% cheaper than leading proprietary rivals from Anthropic, OpenAI, and Google.
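To put the pricing claim in concrete terms, here is a back-of-envelope cost comparison using the per-token prices the article quotes for Mk1. The rival pricing and the workload sizes are illustrative assumptions, not figures from the article.

```python
# Compare the per-request cost of Mk1 against a hypothetical frontier rival.
# Mk1 prices ($0.15/1M input, $1.50/1M output) are from the article; the
# rival prices and token counts below are assumed for illustration only.

def request_cost(input_tokens, output_tokens, in_price, out_price):
    """Dollar cost of one request; prices are per million tokens."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Assumed workload: a long video prompt (500K input tokens) with a short
# structured answer (10K output tokens).
mk1 = request_cost(500_000, 10_000, 0.15, 1.50)
rival = request_cost(500_000, 10_000, 1.25, 10.00)  # assumed rival pricing

print(f"Mk1:   ${mk1:.4f} per request")    # $0.0900
print(f"Rival: ${rival:.4f} per request")  # $0.7250
print(f"Savings: {1 - mk1 / rival:.0%}")   # 88%
```

Under these assumed numbers the savings land at roughly 88%, consistent with the 80–90% range the article cites.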
The model is positioned as a major step toward “physical AI”: AI that does not just understand text or static images, but can reason over movement, time, objects, and real-world scenes. Perceptron says this could open up practical enterprise uses in security, robotics, manufacturing, sports video clipping, research analysis, smart glasses, and content moderation.
Perceptron Mk1 is built for video understanding, especially situations where an AI model needs to analyze live or extended video streams rather than isolated still frames.
The model was developed over 16 months under the leadership of co-founder and CEO Armen Aghajanyan, formerly of Meta FAIR and Microsoft.
Mk1 is presented as highly competitive on spatial and video reasoning benchmarks. On EmbSpatialBench, it scored 85.1, beating Google’s Robotics-ER 1.5 and Alibaba’s Q3.5-27B. On RefSpatialBench, it scored 72.4, far ahead of GPT-5m and Sonnet 4.5, according to the article.
On video benchmarks, Mk1 scored 41.4 on the EgoSchema “Hard Subset” and 88.5 on VSI-Bench, described as the highest recorded score among the compared models.
A major differentiator is cost. Perceptron says Mk1 can match or exceed “frontier” models like GPT-5 and Gemini 3.1 Pro while staying closer to the pricing profile of lighter models.
Technically, Mk1 can process native video at up to 2 frames per second across a 32K token context window. This allows it to maintain temporal continuity and track objects across time, even through occlusions.
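The two published figures (2 fps, 32K-token context) imply a budget for how much continuous video a single request can hold. The tokens-per-frame figure and prompt reserve below are assumptions for illustration; the article does not publish them.

```python
# Rough budget for continuous video inside Mk1's context window.
# CONTEXT_TOKENS and FPS come from the article; TOKENS_PER_FRAME and
# PROMPT_RESERVE are assumed values, since neither is published.

CONTEXT_TOKENS = 32_000   # context window, from the article
FPS = 2                   # native video sampling rate, from the article
TOKENS_PER_FRAME = 200    # assumed; varies by model and frame resolution
PROMPT_RESERVE = 2_000    # assumed tokens held back for the text prompt

frames = (CONTEXT_TOKENS - PROMPT_RESERVE) // TOKENS_PER_FRAME
seconds = frames / FPS
print(f"~{frames} frames, about {seconds:.0f} s of continuous video")
```

Under these assumptions a single context holds on the order of a minute of video; a heavier frame encoding would shrink that window proportionally.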
The model can return structured time codes, making it useful for video clipping, event detection, and finding specific moments inside long video streams.
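A clipping pipeline built on such output might look like the sketch below. The JSON schema is hypothetical: the article says only that Mk1 returns structured time codes, not what the exact format is.

```python
# Sketch of consuming structured time codes for video clipping.
# The response schema here is an assumption for illustration; Perceptron's
# actual output format is not documented in the article.
import json

response = json.loads("""
{
  "events": [
    {"label": "goal scored",  "start": "00:12:03.5", "end": "00:12:11.0"},
    {"label": "replay shown", "start": "00:12:30.0", "end": "00:12:45.0"}
  ]
}
""")

def to_seconds(ts):
    """Convert an 'HH:MM:SS.s' timestamp into seconds."""
    h, m, s = ts.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

# Turn each detected event into a (start, duration) pair, the form most
# video editors and ffmpeg-style tools expect for cutting clips.
clips = [
    (to_seconds(e["start"]), to_seconds(e["end"]) - to_seconds(e["start"]))
    for e in response["events"]
]
print(clips)  # [(723.5, 7.5), (750.0, 15.0)]
```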
Perceptron emphasizes Mk1’s physical reasoning capabilities, including understanding how objects move through space and time. One example given is determining whether a basketball shot happened before or after a buzzer by reasoning about the ball’s position and the shot clock.
The model can perform “pixel-precise” pointing, count dense scenes into the hundreds, and read analog gauges and clocks.
The launch also includes an expanded developer platform and Python SDK, with features such as Focus, Counting, and In-Context Learning.
Perceptron is using a dual-track model strategy: Mk1 is closed-source and API-based, while its Isaac series offers open-weights models for edge and low-latency deployments.
The company is based in Bellevue, Washington, and was founded by Aghajanyan and Akshat Shrivastava, both former Meta FAIR researchers.
Early partners are reportedly using the model for sports highlights, robotics training data, manufacturing quality control, and wearable smart-glasses assistants.
“AI that can see and understand what’s happening in a video, especially a live feed, is understandably an attractive product to lots of enterprises and organizations.”
“This launch signals a new era where models are expected to understand cause-and-effect, object dynamics, and the laws of physics with the same fluency they once applied to grammar.”
“Perceptron has explicitly targeted the ‘Efficiency Frontier,’ a metric that plots mean scores across video and embodied reasoning benchmarks against the blended cost per million tokens.”
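The "blended cost" in that metric is typically a usage-weighted average of input and output prices. A minimal sketch, assuming a 10:1 input-to-output token mix (the ratio is an assumption, not a figure from the article):

```python
# Blended cost per million tokens as a usage-weighted price average.
# Mk1's prices are from the article; the 10:1 input/output mix is assumed.

def blended_cost(in_price, out_price, input_share=10 / 11):
    """Blend per-million-token prices by the fraction of input tokens."""
    return in_price * input_share + out_price * (1 - input_share)

print(f"Mk1 blended: ${blended_cost(0.15, 1.50):.4f} per 1M tokens")
```

A video-heavy workload skews toward input tokens, so the blend sits much closer to the input price than the output price.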
“Unlike traditional vision-language models (VLMs) that often treat video as a disjointed sequence of still images, Mk1 is designed for temporal continuity.”
“This requires more than just pattern recognition; it requires an understanding of how objects move through space and time.”
“Aghajanyan stated that these releases are the culmination of research intended to make AI function best in the physical world, moving toward a future where ‘physical AI’ is as ubiquitous as digital AI.”
Perceptron Mk1 could make advanced video AI more accessible to enterprises by lowering the cost barrier. The article frames this as important because high-end video reasoning has not yet become a mainstream capability, even though many industries have clear use cases for it.
For security, surveillance, robotics, manufacturing, and sports media, Mk1’s ability to track objects over time and reason about physical events could reduce manual review, labeling, and monitoring work.
For developers, the SDK and features like Focus, Counting, and In-Context Learning suggest Perceptron wants Mk1 to be more than a benchmark model. The company is trying to turn video reasoning into practical applications that can be built with minimal code.
For the broader AI market, the launch puts pressure on larger proprietary AI providers by claiming comparable or stronger performance at dramatically lower pricing. If the article’s benchmark and pricing claims hold in real-world deployments, Perceptron could become a serious player in the emerging physical AI category.