Tech

OpenAI Is Asking Contractors to Upload Work From Past Jobs to Evaluate the Performance of AI Agents

Published

January 10, 2026

OpenAI is asking third-party contractors to upload real assignments and tasks from their current or previous workplaces so that it can use the data to evaluate the performance of its next-generation AI models, according to records from OpenAI and the training data company Handshake AI obtained by WIRED.

The project appears to be part of OpenAI’s efforts to establish a human baseline for different tasks that can then be compared with AI models. In September, the company launched a new evaluation process to measure the performance of its AI models against human professionals across a variety of industries. OpenAI says this is a key indicator of its progress towards achieving AGI, or an AI system that outperforms humans at most economically valuable tasks.

“We’ve hired folks across occupations to help collect real-world tasks modeled off those you’ve done in your full-time jobs, so we can measure how well AI models perform on those tasks,” reads one confidential document from OpenAI. “Take existing pieces of long-term or complex work (hours or days+) that you’ve done in your occupation and turn each into a task.”

OpenAI is asking contractors to describe tasks they’ve done in their current job or in the past and to upload real examples of work they did, according to an OpenAI presentation about the project viewed by WIRED. Each of the examples should be “a concrete output (not a summary of the file, but the actual file), e.g., Word doc, PDF, Powerpoint, Excel, image, repo,” the presentation notes. OpenAI says people can also share fabricated work examples created to demonstrate how they would realistically respond in specific scenarios.

OpenAI and Handshake AI declined to comment.

Real-world tasks have two components, according to the OpenAI presentation. There’s the task request (what a person’s manager or colleague told them to do) and the task deliverable (the actual work they produced in response to that request). The company emphasizes multiple times in instructions that the examples contractors share should reflect “real, on-the-job work” that the person has “actually done.”

One example in the OpenAI presentation outlines a task from a “Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals.” The goal is to “Prepare a short, 2-page PDF draft of a 7-day yacht trip overview to the Bahamas for a family who will be traveling there for the first time.” It includes additional details regarding the family’s interests and what the itinerary should look like. The “experienced human deliverable” then shows what the contractor in this case would upload: a real Bahamas itinerary created for a client.

OpenAI instructs the contractors to delete corporate intellectual property and personally identifiable information from the work files they upload. Under a section labeled “Important reminders,” OpenAI tells the workers to “Remove or anonymize any: personal information, proprietary or confidential data, material nonpublic information (e.g., internal strategy, unreleased product details).”

One of the files viewed by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that provides advice on how to delete confidential information.

Evan Brown, an intellectual property lawyer with Neal & McDevitt, tells WIRED that AI labs that receive confidential information from contractors at this scale could be subject to trade secret misappropriation claims. Contractors who offer documents from their previous workplaces to an AI company, even scrubbed, could be at risk of violating their previous employers’ non-disclosure agreements, or exposing trade secrets.

“The AI lab is putting a lot of trust in its contractors to decide what is and isn’t confidential,” says Brown. “If they do let something slip through, are the AI labs really taking the time to determine what is and isn’t a trade secret? It seems to me that the AI lab is putting itself at great risk.”

Tech

Bremont Is Sending a Watch to the Moon’s Surface

Published

April 14, 2026

cineplex360

A multifaceted decahedral black ceramic bezel and sandwich-style three-piece case—a reworking of Bremont’s signature Trip-Tick construction—house a chronometer-rated automatic chronograph movement made by Sellita, with a 62-hour power reserve.

The watch will be a passenger aboard the FLIP rover, due to launch as part of Astrobotic’s Griffin Mission One (Griffin-1), expected to land at the lunar south pole at some point in the second half of this year.

It’s a one-way mission: The rover will remain permanently on the lunar surface, with the watch ticking away as it roams the landscape. FLIP’s objectives include reaching elevated positions on the lunar terrain, gathering data on lunar dust accumulation, testing dust-mitigation coatings, and surviving a two-week lunar night in hibernation (which would be a first for a US rover).

In terms of serious timekeeping data for Bremont, the mission is frankly symbolic. The watch will be positioned vertically in a specially designed housing within the FLIP’s chassis, between its front wheels. Only the watch head, weighing 107 grams, is included, glued in place using a specialist composite, its face visible to FLIP’s HD cameras. But the hibernatory periods will mean the watch (whose mechanical movement is driven in normal circumstances by the motion of the wearer’s arm) will stop running once its 62-hour power reserve runs down.

When the FLIP is on the move again, its motion should, in theory, jolt the mechanism into action once more. Although the moon’s gravitational pull is only a sixth of Earth’s, the acceleration, pitches, and tilts of the rover should swing the winding rotor, albeit with less torque and efficiency than on Earth.

“My guess is that the watch will function from time to time, but for short periods,” says Bremont CEO Davide Cerrato. “We will learn along the way. But that’s what is exciting: it projects us into a thinking process that is absolutely out of the box. Just the fact of having it there is inspiring.” However, there is little doubt that Bremont will, just like other brands with any ties to the cosmos, mine its new space connection for all it is worth.

FLIP itself, which weighs just 1,058 pounds and carries a mix of commercial and government payloads, four HD cameras, and a deployable solar array, is fundamentally a technology demonstrator for Flexible Logistics and Exploration (FLEX), Astrolab’s much larger SUV-sized rover destined to support NASA’s Artemis program. Astrolab developed FLIP from scratch after VIPER, the NASA rover that the Griffin-1 mission had been contracted to carry, was put on pause in 2024. This left Astrobotic seeking a stand-in in short order. Astrolab, which signed the contract within a month of hearing about the opportunity in the fall of 2024, took FLIP from blank sheet to finished rover in roughly a year.

Its standout feature is its hyper-deformable wheels, minutely structured from silicone, composite, and stainless steel, which create a soft, enlarged contact surface with the terrain. “It’s like if you’re off-roading in a Jeep or Land Rover where you let some air out of the tires to go softer and spread the load over a larger area,” explains Astrolab’s founder, Jaret Matthews. While the moon’s nighttime temperatures of around -200 degrees Celsius (around -328 degrees Fahrenheit) would cause conventional rubber tires to become glass-like and shatter, Astrolab’s solution is intended to keep the rover from sinking into the unconsolidated lunar dust, or regolith, that covers the environment.

Tech

Novo Nordisk partners with OpenAI to AI-power drug development | Computer Weekly

Published

April 14, 2026

cineplex360

Danish pharmaceutical company Novo Nordisk has partnered with OpenAI to support drug research and development. Through the partnership, Novo Nordisk said it plans to deploy advanced artificial intelligence (AI) capabilities to analyse complex datasets, identify promising drug candidates and reduce the time required to move from research to patient.

The company said its use of AI has been structured with strict data protection, governance and human oversight to ensure ethical and compliant use. This latest partnership is being positioned as a key part of the company’s strategy to use AI to transform healthcare and enable it to bring new and better treatment options to patients faster.

In 2024, a break-out session run during its Capital Markets Day presented Novo Nordisk’s strategy, discussing how it uses data science and AI and its future plans. The presentation shows that the company set up an AI centre of excellence in 2021, and had begun ramping up investment in high performance computing and graphics processing units (GPUs) by 2023. The company said it has deployed a data pool called FounData, where all data from completed clinical trials are pooled and prepared for insights-generation.

It has also deployed NovoScribe, an AI-powered platform built using MongoDB Atlas Vector Search, Amazon Bedrock and LangChain to automate and accelerate the creation of clinical study reports. Novo Nordisk said NovoScribe reduces the time to regulatory submissions.

At the time, the company said external partnerships and collaborations would continue to play an important role in reaching its AI ambitions.

Earlier this year, Christos Nicolaou, a senior scientific director at Novo Nordisk, posted on LinkedIn that the company has now joined Ligand-AI, a new project funded by the EU public-private partnership, Innovative Health Initiative (IHI).

In the post, he said the project’s goal is to generate high quality, large, open datasets of protein-ligand interactions for thousands of proteins. “In the spirit of open science collaboration, these datasets will be shared and used to implement models and methods to improve AI-driven drug discovery,” he said.

This latest partnership with OpenAI builds on technology partnerships Novo Nordisk has with AWS, Microsoft, Google and Hugging Face, as well as its existing collaboration with OpenAI.

“This partnership is one important step in positioning Novo Nordisk to lead in the next era of healthcare,” said Mike Doustdar, president and CEO of Novo Nordisk. “There are millions of people living with obesity and diabetes who need treatment options, and we know there are therapies still waiting to be discovered that could change their lives.

“Integrating AI in our everyday work gives us the ability to analyse datasets at a scale that was previously impossible, identify patterns we could not see, and test hypotheses faster than ever. This means discovering new therapies and bringing them to market faster than ever before.”

OpenAI said it would be assisting Novo Nordisk in upskilling the company’s global workforce and enhancing AI literacy. Through the partnership, OpenAI’s capabilities will also be used to improve efficiency in manufacturing, supply chain and distribution, and corporate operations. The company is starting pilot programmes across research and development, and manufacturing and commercial operations, with full integration planned by the end of 2026.

Tech

Flood warning: How citizens’ AI agents will swamp public services | Computer Weekly

Published

April 14, 2026

cineplex360

The people running UK public services are busy working out how artificial intelligence (AI) might improve things.

There’s some good stuff happening, like tools to digitise planning information, transcribe probation officers’ conversations or rapidly assess stroke victims. There’s some nicely radical thinking coming out of various pockets of the Government Digital Service and the Department for Science, Innovation and Technology. Teams across government are running countless experiments.

But what if governments are looking through the AI telescope from the wrong end? What if citizens’ own use of AI to access public services proves to be an even more transformative force?
