ChatGPT’s Advanced Voice Mode finally gets visual context on the sixth day of “shipmas”
As the holidays approach, many companies look for ways to capitalize on the season through sales, promotions, or other events. OpenAI has found its own way to join in: the “12 Days of OpenAI” event series.
On Wednesday, OpenAI announced via a post on X that it would host 12 days of livestreams starting December 5 and release “a bunch of new stuff, big and small.”
Also: OpenAI’s Sora AI video generator is here – how to try it
Here’s everything you need to know about the event, along with a roundup of daily drops.
What is “12 Days of OpenAI”?
OpenAI CEO Sam Altman shared more details about the event, which kicks off at 10 a.m. PT on December 5 and runs for 12 weekdays, with a daily livestream featuring a launch or demo. Altman said the releases would range from big items to smaller, stocking-stuffer-style gifts.
What has been released so far?
Thursday, December 12
When the livestream began, OpenAI addressed the elephant in the room: the outage that disrupted its services the day before. The company apologized for the inconvenience and said its team is conducting a postmortem that it will publish later.
Then it went straight into the news, another highly anticipated announcement:
- Advanced Voice Mode now has screen-sharing and vision capabilities, meaning it can see and help with what you’re looking at, whether through your phone’s camera or what’s on your screen.
- These features build on what Advanced Voice already does well: holding casual, human-like conversations that can be interrupted, span multiple turns, and follow non-linear ideas.
- In the demo, the presenter got guidance from ChatGPT’s Advanced Voice on how to brew coffee. As the presenter completed each step, ChatGPT verbally offered insights and direction.
- There’s another holiday bonus: a new Santa Claus voice is available. To launch it, users just tap the snowflake icon. Santa rolls out today everywhere users can access ChatGPT’s voice mode, and the first time you talk to him your usage limit resets, so you can chat with Santa even if you’ve already hit the limit.
- Starting today through next week, video and screen sharing will be rolling out to all Team users and most Pro and Plus subscribers in the latest mobile app. Pro and Plus users in Europe will get access “as soon as possible”, while Enterprise and Edu users will get access early next year.
Wednesday, December 11
Apple released iOS 18.2 today. The update includes ChatGPT integrations across Siri, Writing Tools, and Visual Intelligence, so today’s livestream focused on the integration.
- Siri can now recognize when you ask a question that falls outside its scope and would benefit from a ChatGPT answer. In those cases, it asks whether you’d like ChatGPT to handle the query. Before any request is sent to ChatGPT, a notice appears asking for the user’s permission, keeping control in the user’s hands as much as possible.
- Visual Intelligence is a new feature of the iPhone 16 lineup, accessed by pressing the Camera Control button. Once the camera is open, users can point it at something and search the web with Google, or use ChatGPT to learn more about what they’re viewing or perform other tasks, such as translating or summarizing text.
- Writing Tools now includes a new Compose tool that lets users create text from scratch with ChatGPT. Within the feature, users can even generate images using DALL-E.
All of the above features count toward ChatGPT’s daily usage limits, the same limits free-tier users hit in ChatGPT itself. Users can choose whether to enable the ChatGPT integration in Settings.
Read more about it here: iOS 18.2 coming to iPhone: Try these 6 new AI features now
Tuesday, December 10
- Canvas is rolling out to all web users, regardless of plan, in GPT-4o, which means it’s no longer available only to ChatGPT Plus users in beta.
- Canvas is built natively into GPT-4o, so you can simply call up Canvas rather than having to pick it from the toggle in the model selector.
- The Canvas interface is the same one ChatGPT Plus users saw in the beta, with a panel on the left showing the Q&A exchange and a tab on the right showing your project, all edits, and shortcuts.
- Canvas can also be used with custom GPTs. The feature is enabled by default when creating a new GPT, and you can optionally add Canvas to an existing GPT.
- Canvas can also run Python code directly, allowing ChatGPT to carry out coding tasks such as fixing bugs (a brief illustration follows this list).
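To make that concrete, here is a minimal, hypothetical example of the kind of snippet you might paste into Canvas, ask ChatGPT to fix, and then run in place. The function name and values are illustrative only and are not from OpenAI's announcement.

```python
# Hypothetical Canvas scenario: a small function ChatGPT could repair and
# then execute directly in Canvas. The fix is already applied below; the
# comment marks where the buggy draft went wrong.
def running_average(values):
    """Return the running average after each element of `values`."""
    averages = []
    total = 0
    for i, v in enumerate(values):
        total += v
        # The buggy draft divided by i (off by one); dividing by i + 1 is correct.
        averages.append(total / (i + 1))
    return averages

print(running_average([2, 4, 6]))  # [2.0, 3.0, 4.0]
```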
Read more about it here: I’m a ChatGPT power user – one month later, Canvas is still my favorite productivity feature
Monday, December 9
OpenAI teased the day-three launch as “the thing you’ve been waiting for,” then delivered with the release of its highly anticipated video model, Sora. Here’s what you need to know:
- Called Sora Turbo, the video model is smarter than the version previewed in February.
- It launches in the US later today and is included with ChatGPT Plus and Pro at no extra cost.
- Sora can generate video from text, video from video, and more.
- ChatGPT Plus users can generate up to 50 videos per month at 480p resolution, or fewer videos at 720p. The Pro plan offers 10 times that usage.
- The new model is smarter and cheaper than the one previewed in February.
- Sora features an explore page where users can view each other’s creations. Users can click on any video to see how it was created.
- A live demonstration showed the model in use: presenters entered prompts and selected aspect ratios, durations, and even presets. I found the live demo’s video results to be realistic and stunning.
- OpenAI also launched Storyboard, a tool that lets users provide inputs for each frame in a sequence.
Friday, December 6:
On day two of “shipmas,” OpenAI expanded access to its Reinforcement Fine-Tuning Research Program:
- OpenAI says reinforcement fine-tuning allows developers and machine learning engineers to fine-tune OpenAI models to excel at specific sets of complex, domain-specific tasks.
- Reinforcement fine-tuning is a customization technique in which developers define a model’s behavior by providing tasks and grading the model’s outputs. The model then uses that feedback as a guide to reason better about similar problems and improve its overall accuracy (a rough sketch of the idea follows this list).
- OpenAI encourages research institutions, universities, and businesses to apply for the program, especially those that handle narrow, complex tasks that would benefit from AI assistance and that have objectively correct answers.
- Seats are limited; interested applicants can apply by filling out this form.
- OpenAI aims to make reinforcement fine-tuning publicly available in early 2025.
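OpenAI didn’t share implementation details during the stream, so the sketch below is only a rough, hypothetical illustration of the idea described above: a developer supplies tasks with objectively correct answers plus a grader that scores the model’s outputs, and those grades act as the feedback signal the fine-tuning optimizes. None of the names here come from OpenAI’s API.

```python
# Hypothetical sketch of the ingredients behind reinforcement fine-tuning;
# this is NOT OpenAI's actual API, only the concept from the announcement.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    prompt: str            # a domain-specific question
    reference_answer: str  # the objectively correct answer


def grade_answer(model_output: str, task: Task) -> float:
    """Toy grader: full credit for an exact match, none otherwise.
    Real graders could award partial credit or check structured fields."""
    return 1.0 if model_output.strip() == task.reference_answer.strip() else 0.0


def average_grade(model: Callable[[str], str], tasks: List[Task]) -> float:
    """Score a model across a task set; fine-tuning would push this score upward."""
    return sum(grade_answer(model(t.prompt), t) for t in tasks) / len(tasks)


# Stand-in "model" for illustration: it always returns the same answer.
tasks = [Task("Which gene is most associated with cystic fibrosis?", "CFTR")]
print(average_grade(lambda prompt: "CFTR", tasks))  # 1.0
```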
Thursday, December 5:
OpenAI started with a bang, launching two major upgrades to its chatbot: a new ChatGPT subscription tier, ChatGPT Pro, and the full version of the company’s o1 model.
The full version of o1:
- Is better at all kinds of prompts beyond math and science
- Makes major mistakes about 34% less often than o1-preview, while thinking about 50% faster
- Launches today, replacing o1-preview for all ChatGPT Plus and Pro users
- Accepts images as input (as shown in the demo) for multimodal reasoning, i.e. reasoning over text and images together (see the sketch after this list)
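For developers, the message format below is the one the OpenAI Python SDK already uses for image input with vision-capable models such as GPT-4o. Whether and when o1 accepts images through the API, and under what model name, wasn’t spelled out in the stream, so treat the model name and image URL here as placeholder assumptions.

```python
# Hedged sketch of multimodal input using the OpenAI Python SDK's existing
# image message format (as used with GPT-4o). o1's API availability and the
# model name are assumptions; the demo only showed image input in ChatGPT.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1",  # placeholder model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this whiteboard sketch show?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/whiteboard.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```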
ChatGPT Pro:
- Is aimed at ChatGPT power users, giving them unlimited access to the best OpenAI has to offer, including unlimited use of OpenAI o1-mini, GPT-4o, and Advanced Voice Mode
- Features o1 pro mode, which uses more compute to reason through the hardest science and math problems
- Costs $200 per month
Where can I watch the livestreams?
The livestreams take place on the OpenAI website and are posted to the company’s YouTube channel immediately afterward. For easier access, OpenAI also posts a link to each livestream on its X account roughly 10 minutes before it starts, daily at approximately 10 a.m. PT / 1 p.m. ET.
What can you expect?
The releases are still meant to be surprises, but many expected Sora, OpenAI’s video model first announced last February, to roll out as one of the bigger launches. The model had been available to a select group of red teamers and testers since it was announced, and it was reportedly leaked last week by testers unhappy about performing “unpaid labor.”
Also: OpenAI’s o1 lies more than any major AI model. Why this matters
Other rumored releases included a new, more complete version of the company’s o1 LLM with more advanced reasoning capabilities, as well as a Santa voice for OpenAI’s Advanced Voice Mode, which users spotted in the app’s code under the codename “Straw” just a few weeks ago.