Free GPT‑OSS: Local Model with Web Search ≈ o4-mini

Brief Summary

This video explores the new GPT-OSS models released by OpenAI, focusing on the 20 billion parameter version and its performance on a local machine using Ollama. It highlights the model's reasoning capabilities, its tooling support (specifically search), and compares its code generation with that of Claude Opus 4.1. The video demonstrates how the GPT-OSS model can answer real-time questions and generate code with dependency injection and Docker integration, showcasing its potential for local, edge-based computing.

  • The GPT-OSS 20 billion parameter model can run on machines with 16GB RAM, making it accessible for local use.
  • The model supports tooling, including a search capability, enabling it to access real-time information.
  • It can generate code with features like dependency injection and Docker integration, comparable to higher-level models.

Introduction

The video introduces the new GPT-OSS models released by OpenAI, available in 120 billion and 20 billion parameter versions. These open-weight language models offer reasoning capabilities and are licensed under Apache 2.0, allowing flexible use. OpenAI claims performance close to its o4-mini model, with tooling support. The 120 billion parameter model has 36 layers of weights and 117 billion parameters, requiring significant GPU resources, while the 20 billion parameter model can run on machines with at least 16GB of RAM. The presenter aims to demonstrate the 20 billion parameter model's capabilities using Ollama on their local machine.
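A back-of-the-envelope calculation shows why the 20 billion parameter model fits on a 16GB machine: OpenAI ships the weights in MXFP4 quantisation, roughly 4.25 bits per weight. The sketch below is illustrative only — the bits-per-weight figure is an assumption about the quantisation format, and runtime overhead (KV cache, activations) is not counted.

```python
# Rough memory-footprint estimate for the quantised GPT-OSS weights,
# assuming MXFP4 at ~4.25 bits per weight (4-bit values plus shared
# scale factors). Parameter counts are taken from the summary above.

BITS_PER_WEIGHT = 4.25  # assumption: MXFP4 quantisation

def weight_footprint_gb(params_billions: float) -> float:
    """Approximate size of the quantised weights alone, in GiB."""
    bits = params_billions * 1e9 * BITS_PER_WEIGHT
    return bits / 8 / 1024**3

print(f"gpt-oss-20b  ≈ {weight_footprint_gb(20):.1f} GiB")   # ≈ 9.9 GiB
print(f"gpt-oss-120b ≈ {weight_footprint_gb(117):.1f} GiB")  # ≈ 57.9 GiB
```

At roughly 10 GiB for the weights, the 20B model leaves headroom on a 16GB machine, while the 120B model plainly needs dedicated GPU memory, matching the presenter's description.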

Ollama with GPT-OSS

The presenter transitions to the Ollama interface, highlighting the new search capability available with the GPT-OSS 20 billion parameter model. This feature is not available with models such as DeepSeek-R1 or Gemma 3. The search function allows the model to access real-time information and incorporate it into its responses. There is also a turbo feature, but it requires a paid licence. The presenter plans to demonstrate the search capability and then compare the model's coding abilities with Claude Opus 4.1.

GPT-OSS with Search Tool

The presenter demonstrates the search capability of the GPT-OSS 20 billion parameter model by asking about the current weather in Auckland, New Zealand. The model performs an online search to retrieve this real-time information, a feature not available in models without tooling support. Although the initial result provides incorrect date information, it still manages to fetch the current temperature and time. The presenter then plans to test the model's coding capabilities by using a command previously used with Claude Opus 4.1 to generate Playwright code.
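The weather demo follows the standard function-calling loop: the model emits a tool call, the host executes the tool, and the result is appended to the conversation so the model can answer from it. The Python sketch below stubs out both the model and the search backend to show the loop's shape — none of these names are the real Ollama API, and in practice the model would be gpt-oss served by Ollama with a live search tool.

```python
# Minimal sketch of the host-side tool-calling loop behind a "search"
# capability. Both fake_model and search are stubs for illustration.

def fake_model(messages):
    """Stub model: requests one search, then answers from its result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "search",
                              "arguments": {"query": "weather Auckland"}}}
    result = next(m for m in messages if m["role"] == "tool")["content"]
    return {"content": f"Based on a live search: {result}"}

def search(query: str) -> str:
    """Stub search tool; a real host would call a search API here."""
    return "Auckland: 14°C, light rain"

TOOLS = {"search": search}

def run(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = fake_model(messages)
        if "tool_call" in reply:
            # Model asked for a tool: execute it and feed the result back.
            call = reply["tool_call"]
            output = TOOLS[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "content": output})
        else:
            return reply["content"]  # final answer

print(run("What is the current weather in Auckland?"))
```

The loop also explains the date glitch the presenter saw: the model's answer is only as accurate as what the tool returns plus what the model infers around it, so a correct temperature can sit next to a wrong date.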

Claude Opus 4.1 vs GPT-OSS running in Ollama

The presenter compares the GPT-OSS model's code generation with that of Claude Opus 4.1, using the same prompt: write a Playwright C# .NET framework with dependency injection, the page object model, separation of concerns, an extensible framework, and Docker container integration. Claude Opus 4.1 had previously generated a comprehensive code structure with project files, app settings, dependency injection, and a Docker container setup.

Running the same prompt on the local GPT-OSS model, the presenter watches it begin processing the request, utilising its search capability to gather information on dependency injection and Playwright C# test runners. It identifies the need for a Dockerfile and searches online for Docker setup information.

The model then generates a project overview: xUnit tests, fixtures for Playwright with dependency injection, and page object model code for login, application, and upload pages. It creates configuration via an appsettings.json file and a Docker image for containerisation. The generated structure includes the CS files, app settings, program fixtures, collections, base page, login page, application page, file helper, locator helper, test data, and application data. The presenter notes the code is well structured, with key points and explanatory information for each operation.

Despite the laptop fan working hard, the presenter is impressed with the model's reasoning and code scaffolding, considering it a significant improvement over earlier local models. The video concludes by highlighting the GPT-OSS model's features: function calling, web browsing (search), Python tool calls, structured outputs, configurable reasoning effort, and fine-tunability, all under the Apache 2.0 licence.
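The two patterns at the heart of the generated framework — the page object model and constructor dependency injection — can be sketched compactly. The example below is in Python for illustration (the video's output was C#), with a stubbed page object standing in for Playwright so it runs anywhere; all class and method names are hypothetical, not the model's actual output.

```python
# Sketch of the page-object-model + constructor dependency-injection
# pattern used by the generated test framework. FakePage stands in
# for a real Playwright page so the example is self-contained.

class FakePage:
    """Stand-in for a Playwright page: records actions instead of driving a browser."""
    def __init__(self):
        self.actions = []

    def fill(self, selector: str, value: str):
        self.actions.append(("fill", selector, value))

    def click(self, selector: str):
        self.actions.append(("click", selector))

class BasePage:
    """Base page: the page driver is injected, never constructed here."""
    def __init__(self, page):
        self.page = page

class LoginPage(BasePage):
    """Page object: wraps the login screen's selectors behind one method."""
    def login(self, user: str, password: str):
        self.page.fill("#user", user)
        self.page.fill("#password", password)
        self.page.click("#submit")

# A trivial composition root wires the dependency in one place,
# mirroring what a DI container does in the C# framework.
page = FakePage()
LoginPage(page).login("alice", "secret")
print(page.actions)
```

Injecting the page keeps tests swappable between a real browser and a fake, which is the same separation-of-concerns benefit the presenter highlights in the generated C# structure.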
