<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Ai on DevOps Way - Практические гайды</title>
    <link>https://devopsway.ru/tags/ai/</link>
    <description>Recent content in Ai on DevOps Way - Практические гайды</description>
    <image>
      <title>DevOps Way - Практические гайды</title>
      <url>https://devopsway.ru/images/devopsway-og.png</url>
      <link>https://devopsway.ru/images/devopsway-og.png</link>
    </image>
    <generator>Hugo -- 0.161.1</generator>
    <language>ru</language>
    <lastBuildDate>Thu, 07 May 2026 09:08:35 -0400</lastBuildDate>
    <atom:link href="https://devopsway.ru/tags/ai/feed.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>RAG Pipeline 1/N: Qdrant, a vector database for AI</title>
      <link>https://devopsway.ru/posts/rag-01-qdrant-vectors/</link>
      <pubDate>Thu, 07 May 2026 12:00:00 +0300</pubDate>
      <guid>https://devopsway.ru/posts/rag-01-qdrant-vectors/</guid>
      <description>Why AI needs a vector database, how Qdrant works, and cosine similarity in plain terms. Hands-on: run Qdrant in Docker, create a collection, and run a semantic search in 15 minutes.</description>
      <content:encoded><![CDATA[<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Bloom</td>
          <td>L3–L4 (Apply → Analyze)</td>
      </tr>
      <tr>
          <td>SFIA</td>
          <td>Level 2–3</td>
      </tr>
      <tr>
          <td>Dreyfus</td>
          <td>Advanced Beginner → Competent</td>
      </tr>
      <tr>
          <td>Artifact</td>
          <td>docker-compose.yml + verification script</td>
      </tr>
      <tr>
          <td>Verification</td>
          <td><code>curl localhost:6333/healthz</code> → <code>healthz check passed</code>, semantic search works</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="tldr">TL;DR</h2>
<p>An AI model forgets everything after each session. A vector database solves this: it stores knowledge as numbers and finds content that is similar in meaning rather than in keywords.</p>
<hr>
<h2 id="проблема-ai-без-памяти">The problem: AI without memory</h2>
<p>Every session with an LLM starts from a blank slate:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">WITHOUT RAG:                      WITH RAG:
</span></span><span class="line"><span class="cl">┌──────────┐                      ┌──────────┐
</span></span><span class="line"><span class="cl">│   LLM    │──→ Ollama            │   LLM    │──→ Ollama
</span></span><span class="line"><span class="cl">│          │                      │   + RAG  │
</span></span><span class="line"><span class="cl">└──────────┘                      └────┬─────┘
</span></span><span class="line"><span class="cl">     │                                 │
</span></span><span class="line"><span class="cl">  session                           session
</span></span><span class="line"><span class="cl">  closed                            closed
</span></span><span class="line"><span class="cl">     │                                 │
</span></span><span class="line"><span class="cl">     ▼                                 ▼
</span></span><span class="line"><span class="cl">┌──────────┐                      ┌──────────┐
</span></span><span class="line"><span class="cl">│   LLM    │  &#34;I remember         │   LLM    │  &#34;Yes, last time
</span></span><span class="line"><span class="cl">│          │   nothing&#34;           │   + RAG  │   we did X&#34;
</span></span><span class="line"><span class="cl">└──────────┘                      └────┬─────┘
</span></span><span class="line"><span class="cl">                                       │
</span></span><span class="line"><span class="cl">                                  ┌────▼─────┐
</span></span><span class="line"><span class="cl">                                  │  Qdrant  │ ← vectors
</span></span><span class="line"><span class="cl">                                  └──────────┘
</span></span></code></pre></div><p>You spent an hour explaining the project architecture to the model. You closed the terminal. You opened it again, and everything starts from scratch.</p>
<p>An ordinary text query in PostgreSQL does not help here: <code>LIKE</code> and keyword search match words, not meaning. The query &ldquo;how to set up a reverse proxy&rdquo; will not find a document that says &ldquo;relaying requests through nginx&rdquo;. <strong>Different words, same meaning</strong>: that is a job for vector search.</p>
<hr>
<h2 id="что-такое-qdrant">What is Qdrant</h2>
<p>Qdrant (pronounced &ldquo;quadrant&rdquo;) is a vector database. It stores data not as table rows but as points in a multidimensional space.</p>
<h3 id="три-ключевых-концепции">Three key concepts</h3>
<p><strong>1. Collection</strong> &ndash; the counterpart of a table in PostgreSQL, but instead of columns and rows it holds a set of points, each with a vector.</p>
<p><strong>2. Vector</strong> &ndash; the coordinates of meaning in a numeric space. An embedding model turns text into a fixed-length array of numbers. Texts with similar meaning get nearby coordinates, even when they use different words:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">&#34;Docker container&#34;  → [0.12, -0.34, 0.56, ..., 0.78]   # 384 numbers
</span></span><span class="line"><span class="cl">&#34;Containerization&#34;  → [0.11, -0.31, 0.54, ..., 0.76]   # a similar vector
</span></span><span class="line"><span class="cl">&#34;Borscht recipe&#34;    → [-0.89, 0.45, -0.12, ..., 0.03]  # entirely different
</span></span></code></pre></div><p><strong>3. Cosine similarity</strong> &ndash; a measure of how alike two vectors are. It ignores length and compares direction only:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">cosine(&#34;Docker container&#34;, &#34;Containerization&#34;) = 0.94  ← similar
</span></span><span class="line"><span class="cl">cosine(&#34;Docker container&#34;, &#34;Borscht recipe&#34;)   = 0.12  ← not similar
</span></span></code></pre></div><p>The closer the value is to 1, the more the meanings overlap, even when the words differ. Formally, cosine similarity ranges from -1 to 1: 1 means the same direction, 0 means nothing in common, and negative values mean opposite directions. In practice a score above 0.8 usually indicates a good match, although the useful threshold depends on the embedding model.</p>
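<p>To make the formula concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and the function name <code>cosine_similarity</code> is our own, not part of any Qdrant API.</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example (not real embeddings)
docker = [0.9, 0.1, 0.0]
containers = [1.8, 0.2, 0.0]  # same direction as docker, twice the length
borscht = [0.0, 0.0, 1.0]     # orthogonal to docker

print(round(cosine_similarity(docker, containers), 4))  # 1.0
print(round(cosine_similarity(docker, borscht), 4))     # 0.0
```

<p>The second vector is twice as long as the first, yet the score is still 1.0: only direction matters, not length.</p>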
<h3 id="payload--метаданные-к-вектору">Payload: metadata attached to a vector</h3>
<p>Each point in Qdrant stores not only a vector but also arbitrary data:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-json" data-lang="json"><span class="line"><span class="cl"><span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;id&#34;</span><span class="p">:</span> <span class="s2">&#34;5c56c793-69f3-4fbf-87e6-c4bf54c28c26&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;vector&#34;</span><span class="p">:</span> <span class="p">[</span><span class="mf">0.12</span><span class="p">,</span> <span class="mf">-0.34</span><span class="p">,</span> <span class="err">...</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">  <span class="nt">&#34;payload&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;text&#34;</span><span class="p">:</span> <span class="s2">&#34;For a reverse proxy, use proxy_pass...&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;file_path&#34;</span><span class="p">:</span> <span class="s2">&#34;nginx-guide.md&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;start_line&#34;</span><span class="p">:</span> <span class="mi">45</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;end_line&#34;</span><span class="p">:</span> <span class="mi">89</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="nt">&#34;tags&#34;</span><span class="p">:</span> <span class="s2">&#34;nginx, proxy&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>The payload lets the model not only find the relevant fragment but also <strong>cite the source</strong>: &ldquo;According to nginx-guide.md, lines 45-89&hellip;&rdquo;</p>
<hr>
<h2 id="практика-qdrant-за-15-минут">Hands-on: Qdrant in 15 minutes</h2>
<h3 id="шаг-1-запускаем-qdrant">Step 1. Run Qdrant</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">docker run -d <span class="se">\
</span></span></span><span class="line"><span class="cl">  --name qdrant <span class="se">\
</span></span></span><span class="line"><span class="cl">  -p 6333:6333 <span class="se">\
</span></span></span><span class="line"><span class="cl">  -v qdrant-data:/qdrant/storage <span class="se">\
</span></span></span><span class="line"><span class="cl">  qdrant/qdrant:latest
</span></span></code></pre></div><p>Check that it is running:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">curl -s http://localhost:6333/healthz
</span></span><span class="line"><span class="cl"><span class="c1"># healthz check passed</span>
</span></span></code></pre></div><p>The dashboard is available in a browser at <code>http://localhost:6333/dashboard</code></p>
<h3 id="шаг-2-создаём-коллекцию">Step 2. Create a collection</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">curl -X PUT http://localhost:6333/collections/demo <span class="se">\
</span></span></span><span class="line"><span class="cl">  -H <span class="s2">&#34;Content-Type: application/json&#34;</span> <span class="se">\
</span></span></span><span class="line"><span class="cl">  -d <span class="s1">&#39;{
</span></span></span><span class="line"><span class="cl"><span class="s1">    &#34;vectors&#34;: {
</span></span></span><span class="line"><span class="cl"><span class="s1">      &#34;size&#34;: 384,
</span></span></span><span class="line"><span class="cl"><span class="s1">      &#34;distance&#34;: &#34;Cosine&#34;
</span></span></span><span class="line"><span class="cl"><span class="s1">    }
</span></span></span><span class="line"><span class="cl"><span class="s1">  }&#39;</span>
</span></span></code></pre></div><p>Parameters:</p>
<ul>
<li><code>size: 384</code> &ndash; vector dimensionality (set by the embedding model; all-MiniLM outputs 384)</li>
<li><code>distance: &quot;Cosine&quot;</code> &ndash; comparison metric (cosine similarity)</li>
</ul>
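<p>Because <code>size</code> is fixed when the collection is created, it can help to validate vectors on the client before sending them. The sketch below is one possible guard. The names <code>COLLECTION_SIZE</code> and <code>check_dimension</code> are hypothetical helpers for this article, not part of Qdrant.</p>

```python
# Must equal the "size" passed when the collection was created (384 in this article)
COLLECTION_SIZE = 384


def check_dimension(vector, expected=COLLECTION_SIZE):
    """Return the vector unchanged, or raise if its length does not match the collection."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")
    return vector


check_dimension([0.0] * 384)      # passes silently

try:
    check_dimension([0.0] * 768)  # e.g. output of a 768-dimensional model
except ValueError as err:
    print(err)                    # expected 384 dimensions, got 768
```

<p>Failing early on the client gives a clearer message than waiting for the server to reject the request.</p>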
<p>Check the result:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">curl -s http://localhost:6333/collections/demo <span class="p">|</span> python3 -m json.tool
</span></span></code></pre></div><h3 id="шаг-3-добавляем-данные-upsert">Step 3. Add data (upsert)</h3>
<p>There is no need to write 384 numbers by hand: a Python script generates random demo vectors (in a real pipeline an embedding model produces them):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="ch">#!/usr/bin/env python3</span>
</span></span><span class="line"><span class="cl"><span class="c1"># demo-upsert.py: add points to Qdrant</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">requests</span><span class="o">,</span> <span class="nn">random</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">QDRANT</span> <span class="o">=</span> <span class="s2">&#34;http://localhost:6333&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">points</span> <span class="o">=</span> <span class="p">[</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vector&#34;</span><span class="p">:</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">384</span><span class="p">)],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;payload&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;text&#34;</span><span class="p">:</span> <span class="s2">&#34;Для reverse proxy в nginx используйте proxy_pass&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;source&#34;</span><span class="p">:</span> <span class="s2">&#34;nginx-guide.md&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;topic&#34;</span><span class="p">:</span> <span class="s2">&#34;nginx&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">},</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;id&#34;</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;vector&#34;</span><span class="p">:</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">384</span><span class="p">)],</span>
</span></span><span class="line"><span class="cl">        <span class="s2">&#34;payload&#34;</span><span class="p">:</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;text&#34;</span><span class="p">:</span> <span class="s2">&#34;Docker Compose описывает многоконтейнерное приложение в YAML&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;source&#34;</span><span class="p">:</span> <span class="s2">&#34;docker-guide.md&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;topic&#34;</span><span class="p">:</span> <span class="s2">&#34;docker&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">resp</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">put</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">QDRANT</span><span class="si">}</span><span class="s2">/collections/demo/points?wait=true&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                    <span class="n">json</span><span class="o">=</span><span class="p">{</span><span class="s2">&#34;points&#34;</span><span class="p">:</span> <span class="n">points</span><span class="p">})</span>
</span></span><span class="line"><span class="cl"><span class="nb">print</span><span class="p">(</span><span class="n">resp</span><span class="o">.</span><span class="n">json</span><span class="p">())</span>
</span></span><span class="line"><span class="cl"><span class="c1"># {&#34;result&#34;:{&#34;operation_id&#34;:0,&#34;status&#34;:&#34;completed&#34;},...}</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">pip install requests
</span></span><span class="line"><span class="cl">python3 demo-upsert.py
</span></span></code></pre></div><p><code>upsert</code> &ndash; if a point with this ID already exists, it is updated; otherwise it is created. The operation is idempotent.</p>
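<p>Idempotency only helps if the same chunk always gets the same ID. Qdrant accepts unsigned integers or UUIDs as point IDs, so one possible approach is to derive a UUID from the source file and starting line. This is a sketch under that assumption. The namespace URL and the <code>chunk_id</code> helper are hypothetical.</p>

```python
import uuid

# Hypothetical fixed namespace for this sketch; any constant UUID works
CHUNK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/rag-chunks")


def chunk_id(source, start_line):
    """Derive a stable UUID string from a chunk's file name and first line."""
    return str(uuid.uuid5(CHUNK_NAMESPACE, f"{source}:{start_line}"))


first = chunk_id("nginx-guide.md", 45)
again = chunk_id("nginx-guide.md", 45)
other = chunk_id("docker-guide.md", 45)

print(first == again)  # True
print(first == other)  # False
```

<p>Re-indexing the same file then overwrites existing points instead of creating duplicates.</p>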
<h3 id="шаг-4-семантический-поиск">Step 4. Semantic search</h3>
<p>Find the point closest to our query:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="ch">#!/usr/bin/env python3</span>
</span></span><span class="line"><span class="cl"><span class="c1"># demo-search.py: semantic search in Qdrant</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">requests</span><span class="o">,</span> <span class="nn">random</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">QDRANT</span> <span class="o">=</span> <span class="s2">&#34;http://localhost:6333&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Demo only: the same seed reproduces the vector of point 1 (nginx)</span>
</span></span><span class="line"><span class="cl"><span class="c1"># In a real pipeline an embedding model creates the query vector</span>
</span></span><span class="line"><span class="cl"><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">query_vector</span> <span class="o">=</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">384</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">resp</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">post</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">QDRANT</span><span class="si">}</span><span class="s2">/collections/demo/points/search&#34;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                     <span class="n">json</span><span class="o">=</span><span class="p">{</span>
</span></span><span class="line"><span class="cl">                         <span class="s2">&#34;vector&#34;</span><span class="p">:</span> <span class="n">query_vector</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                         <span class="s2">&#34;limit&#34;</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">                         <span class="s2">&#34;with_payload&#34;</span><span class="p">:</span> <span class="kc">True</span>
</span></span><span class="line"><span class="cl">                     <span class="p">})</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">for</span> <span class="n">hit</span> <span class="ow">in</span> <span class="n">resp</span><span class="o">.</span><span class="n">json</span><span class="p">()[</span><span class="s2">&#34;result&#34;</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">    <span class="n">score</span> <span class="o">=</span> <span class="n">hit</span><span class="p">[</span><span class="s2">&#34;score&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">text</span> <span class="o">=</span> <span class="n">hit</span><span class="p">[</span><span class="s2">&#34;payload&#34;</span><span class="p">][</span><span class="s2">&#34;text&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="n">source</span> <span class="o">=</span> <span class="n">hit</span><span class="p">[</span><span class="s2">&#34;payload&#34;</span><span class="p">][</span><span class="s2">&#34;source&#34;</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;  [</span><span class="si">{</span><span class="n">score</span><span class="si">:</span><span class="s2">.4f</span><span class="si">}</span><span class="s2">] </span><span class="si">{</span><span class="n">source</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">text</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">python3 demo-search.py
</span></span><span class="line"><span class="cl"><span class="c1">#   [1.0000] nginx-guide.md: Для reverse proxy в nginx используйте proxy_pass</span>
</span></span><span class="line"><span class="cl"><span class="c1">#   [0.0042] docker-guide.md: Docker Compose описывает многоконтейнерное...</span>
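</span></span></code></pre></div><p>The nginx point scored 1.0 (a perfect match, because we searched with the very same vector). The Docker point scored close to 0: its vector is random and unrelated to the query, and independent random vectors in 384 dimensions are almost orthogonal. The exact second score depends on the random draw and may be slightly negative.</p>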
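<p>The two scores can be reproduced without Qdrant. The sketch below regenerates the two demo vectors with the same seed and compares them locally. The helper <code>cosine</code> is our own, and the second score is printed rather than quoted because it depends on the random draw.</p>

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


random.seed(42)
v1 = [random.uniform(-1, 1) for _ in range(384)]  # same draws as point 1 in demo-upsert.py
v2 = [random.uniform(-1, 1) for _ in range(384)]  # same draws as point 2

same = cosine(v1, v1)
other = cosine(v1, v2)

print(f"same vector:  {same:.4f}")   # 1.0000
print(f"other vector: {other:.4f}")  # small value near 0, sign may vary
```

<p>This is why the demo proves that the plumbing works but says nothing about meaning: random vectors carry no semantics.</p>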
<h3 id="шаг-5-реальный-семантический-поиск-с-ollama">Step 5. Real semantic search (with Ollama)</h3>
<p>In a real pipeline the vectors come from an embedding model. Here is how that looks with Ollama:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Download the embedding model</span>
</span></span><span class="line"><span class="cl">ollama pull all-minilm
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Get the query vector</span>
</span></span><span class="line"><span class="cl">curl -s http://localhost:11434/api/embed <span class="se">\
</span></span></span><span class="line"><span class="cl">  -d <span class="s1">&#39;{&#34;model&#34;:&#34;all-minilm&#34;,&#34;input&#34;:&#34;как настроить reverse proxy&#34;}&#39;</span> <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="p">|</span> python3 -c <span class="s2">&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">import sys, json
</span></span></span><span class="line"><span class="cl"><span class="s2">emb = json.load(sys.stdin)[&#39;embeddings&#39;][0]
</span></span></span><span class="line"><span class="cl"><span class="s2">print(f&#39;Dimensions: {len(emb)}&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">print(f&#39;First 5 numbers: {[round(x,4) for x in emb[:5]]}&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Dimensions: 384</span>
</span></span><span class="line"><span class="cl"><span class="c1"># First 5 numbers: [-0.0312, 0.0891, -0.0456, 0.1234, -0.0678]</span>
</span></span></code></pre></div><p>The same text always yields the same vector as long as the model stays the same. Similar texts yield similar vectors. Semantic search is built on exactly this.</p>
<hr>
<h2 id="под-капотом-как-работает-rag-поиск">Under the hood: how RAG search works</h2>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">User query                   Knowledge base (Qdrant)
</span></span><span class="line"><span class="cl">&#34;how to set up proxy&#34;         ┌──────────────────┐
</span></span><span class="line"><span class="cl">        │                     │ nginx-guide.md   │→ [0.12, -0.34, ...]
</span></span><span class="line"><span class="cl">        ▼                     │ docker-guide.md  │→ [0.89, 0.45, ...]
</span></span><span class="line"><span class="cl"> Embedding Model              │ ssh-guide.md     │→ [-0.56, 0.23, ...]
</span></span><span class="line"><span class="cl"> (all-MiniLM)                 └──────────────────┘
</span></span><span class="line"><span class="cl">        │                              │
</span></span><span class="line"><span class="cl">        ▼                              │
</span></span><span class="line"><span class="cl"> [0.11, -0.31, ...]       cosine similarity
</span></span><span class="line"><span class="cl">        │                              │
</span></span><span class="line"><span class="cl">        └──────────────────────────────┘
</span></span><span class="line"><span class="cl">                    │
</span></span><span class="line"><span class="cl">                    ▼
</span></span><span class="line"><span class="cl">            Ranking:
</span></span><span class="line"><span class="cl">            1. nginx-guide.md  → 0.94
</span></span><span class="line"><span class="cl">            2. docker-guide.md → 0.67
</span></span><span class="line"><span class="cl">            3. ssh-guide.md    → 0.23
</span></span><span class="line"><span class="cl">                    │
</span></span><span class="line"><span class="cl">                    ▼
</span></span><span class="line"><span class="cl">            Top-K results → into the LLM context
</span></span></code></pre></div><p>Key stages:</p>
<ol>
<li><strong>Embedding</strong> &ndash; the query text is turned into a vector by the same model that was used to index the knowledge base</li>
<li><strong>Vector search</strong> &ndash; Qdrant finds the nearest points by cosine similarity (approximate nearest-neighbor search over an HNSW index, with roughly logarithmic search cost)</li>
<li><strong>Ranking</strong> &ndash; results are sorted by relevance score</li>
<li><strong>Context injection</strong> &ndash; the top K results are inserted into the LLM prompt together with their metadata</li>
</ol>
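<p>The last two stages can be sketched in a few lines. The example below assumes hits shaped like the search response from Step 4 (a <code>score</code> plus a <code>payload</code> with <code>text</code> and <code>source</code>). The function names, the 0.7 cutoff and the prompt wording are illustrative choices, not a fixed recipe.</p>

```python
def build_context(hits, min_score=0.7, top_k=3):
    """Keep the best-scoring hits and format each as '[source] text'."""
    kept = [hit for hit in hits if hit["score"] >= min_score]
    kept.sort(key=lambda hit: hit["score"], reverse=True)
    lines = []
    for hit in kept[:top_k]:
        payload = hit["payload"]
        lines.append(f"[{payload['source']}] {payload['text']}")
    return "\n".join(lines)


def build_prompt(question, hits):
    """Assemble a prompt that carries the retrieved fragments and their sources."""
    context = build_context(hits)
    return (
        "Answer using only the context below and cite the source file.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up hits for illustration; real ones come from the Qdrant search response
hits = [
    {"score": 0.23, "payload": {"source": "ssh-guide.md", "text": "Use ssh-keygen to create a key pair"}},
    {"score": 0.94, "payload": {"source": "nginx-guide.md", "text": "Use proxy_pass for a reverse proxy"}},
]

print(build_prompt("how to set up a proxy", hits))
```

<p>Only the nginx fragment ends up in the prompt. The ssh fragment falls below the cutoff and is dropped before it can distract the model.</p>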
<p>An important rule: <strong>use one embedding model for both indexing and search</strong>. If you indexed with <code>all-minilm</code>, search with <code>all-minilm</code> too. Different models produce incompatible vectors.</p>
<hr>
<h2 id="мини-тест">Mini quiz</h2>
<p><strong>1. Why does PostgreSQL with <code>LIKE '%proxy%'</code> not replace vector search?</strong></p>
<details>
<summary>Answer</summary>
<p><code>LIKE</code> matches an exact substring. The query &ldquo;configuring request forwarding&rdquo; will not find a document that contains the word &ldquo;proxy&rdquo;. Vector search compares meaning rather than letters: the cosine similarity between semantically similar texts is high regardless of the specific words.</p>
</details>
<p><strong>2. The collection was created with <code>size: 384</code>. Can you add a vector of length 768?</strong></p>
<details>
<summary>Answer</summary>
<p>No. The dimensionality is fixed when the collection is created and must match the output of the embedding model. all-MiniLM produces 384, mxbai-embed-large produces 1024, nomic-embed-text produces 768. A mismatch causes an error on upsert.</p>
</details>
<p><strong>3. Score = 0.95. Score = 0.45. What does this say about search quality?</strong></p>
<details>
<summary>Answer</summary>
<p>With cosine similarity, 0.95 means high semantic similarity: the fragment is almost certainly relevant to the query. 0.45 is a weak match: the fragment is most likely about something else. In practice the cutoff is often set around 0.7-0.8, but it should be tuned for the embedding model in use.</p>
</details>
<p><strong>4. What is a payload in Qdrant, and why does RAG need it?</strong></p>
<details>
<summary>Answer</summary>
<p>A payload is arbitrary data attached to a vector (text, file path, line numbers, tags). In RAG the payload lets the LLM not only find the relevant fragment but also cite the source: &ldquo;According to nginx-guide.md, lines 45-89&hellip;&rdquo;</p>
</details>
<hr>
<h2 id="артефакт-docker-composeyml">Artifact: docker-compose.yml</h2>
<p>A ready-to-use file for running Qdrant with persistent storage:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># docker-compose.yml: Qdrant for the RAG pipeline</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">services</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">qdrant</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">qdrant/qdrant:latest</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">container_name</span><span class="p">:</span><span class="w"> </span><span class="l">qdrant</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="s2">&#34;6333:6333&#34;</span><span class="w">   </span><span class="c"># REST API + Dashboard</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="s2">&#34;6334:6334&#34;</span><span class="w">   </span><span class="c"># gRPC (optional, e.g. for the Python SDK)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">qdrant-data:/qdrant/storage</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">environment</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">QDRANT__SERVICE__GRPC_PORT=6334</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">restart</span><span class="p">:</span><span class="w"> </span><span class="l">unless-stopped</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">healthcheck</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c"># Assumption: the qdrant/qdrant image may not ship curl, so probe the port with bash instead</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">test</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;CMD-SHELL&#34;</span><span class="p">,</span><span class="w"> </span><span class="s2">&#34;bash -c &#39;:&gt; /dev/tcp/127.0.0.1/6333&#39; || exit 1&#34;</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">interval</span><span class="p">:</span><span class="w"> </span><span class="l">30s</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">timeout</span><span class="p">:</span><span class="w"> </span><span class="l">10s</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">retries</span><span class="p">:</span><span class="w"> </span><span class="m">3</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">start_period</span><span class="p">:</span><span class="w"> </span><span class="l">10s</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">qdrant-data</span><span class="p">:</span><span class="w">
</span></span></span></code></pre></div><p>Verification script:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash
</span></span></span><span class="line"><span class="cl"><span class="c1"># check-qdrant.sh: verify that Qdrant is up and list its collections</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">set</span> -euo pipefail
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nv">QDRANT_URL</span><span class="o">=</span><span class="s2">&#34;</span><span class="si">${</span><span class="nv">QDRANT_URL</span><span class="k">:-</span><span class="nv">http</span><span class="p">://localhost:</span><span class="nv">6333</span><span class="si">}</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Qdrant Health Check ===&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># 1. Health</span>
</span></span><span class="line"><span class="cl"><span class="c1"># curl -f exits non-zero on HTTP errors, so test the exit status rather than the response body</span>
</span></span><span class="line"><span class="cl"><span class="k">if</span> curl -sf -o /dev/null <span class="s2">&#34;</span><span class="nv">$QDRANT_URL</span><span class="s2">/healthz&#34;</span><span class="p">;</span> <span class="k">then</span>
</span></span><span class="line"><span class="cl">    <span class="nb">echo</span> <span class="s2">&#34;[OK] Qdrant is healthy&#34;</span>
</span></span><span class="line"><span class="cl"><span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="nb">echo</span> <span class="s2">&#34;[FAIL] Qdrant is not responding at </span><span class="nv">$QDRANT_URL</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="nb">exit</span> <span class="m">1</span>
</span></span><span class="line"><span class="cl"><span class="k">fi</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># 2. Collections</span>
</span></span><span class="line"><span class="cl">curl -sf <span class="s2">&#34;</span><span class="nv">$QDRANT_URL</span><span class="s2">/collections&#34;</span> <span class="p">|</span> python3 -c <span class="s2">&#34;
</span></span></span><span class="line"><span class="cl"><span class="s2">import sys, json
</span></span></span><span class="line"><span class="cl"><span class="s2">data = json.load(sys.stdin)
</span></span></span><span class="line"><span class="cl"><span class="s2">cols = data.get(&#39;result&#39;, {}).get(&#39;collections&#39;, [])
</span></span></span><span class="line"><span class="cl"><span class="s2">if cols:
</span></span></span><span class="line"><span class="cl"><span class="s2">    for c in cols:
</span></span></span><span class="line"><span class="cl"><span class="s2">        print(f\&#34;  {c[&#39;name&#39;]}\&#34;)
</span></span></span><span class="line"><span class="cl"><span class="s2">    print(f&#39;Total collections: {len(cols)}&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">else:
</span></span></span><span class="line"><span class="cl"><span class="s2">    print(&#39;[INFO] No collections yet&#39;)
</span></span></span><span class="line"><span class="cl"><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Done ===&#34;</span>
</span></span></code></pre></div><p>Run:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">docker compose up -d
</span></span><span class="line"><span class="cl">chmod +x check-qdrant.sh
</span></span><span class="line"><span class="cl">./check-qdrant.sh
</span></span></code></pre></div><hr>
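<p>Right after <code>docker compose up -d</code> the container may still be starting, so the first health check can fail. Below is a minimal sketch of a readiness wait in Python that polls the same <code>/healthz</code> endpoint. The names <code>http_probe</code> and <code>wait_until_ready</code>, as well as the retry count and delay, are illustrative assumptions and not part of the original setup.</p>

```python
import http.client
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, malformed reply
        return False


def wait_until_ready(url, attempts=10, delay=1.0, probe=http_probe, sleep=time.sleep):
    """Poll url until the probe succeeds.

    Returns the 1-based attempt number that succeeded, or 0 if every
    attempt failed. probe and sleep are injectable to simplify testing.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt != attempts:
            sleep(delay)  # no need to sleep after the final attempt
    return 0
```

<p>A call such as <code>wait_until_ready("http://localhost:6333/healthz")</code> could run before <code>check-qdrant.sh</code>; a return value of 0 means the service never answered.</p>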
<h2 id="продакшен-параметры">Production parameters</h2>
<p>For reference, here are the parameters of our production RAG pipeline:</p>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Value</th>
          <th>Why</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Collection</td>
          <td><code>student_knowledge</code></td>
          <td>Kept separate from the test collection</td>
      </tr>
      <tr>
          <td>Vector size</td>
          <td>384 (all-MiniLM)</td>
          <td>Fast embedding, sufficient quality</td>
      </tr>
      <tr>
          <td>Distance</td>
          <td>Cosine</td>
          <td>Standard for text search</td>
      </tr>
      <tr>
          <td>Chunking</td>
          <td>20 lines, split at def/class</td>
          <td>One logical block per chunk</td>
      </tr>
      <tr>
          <td>Truncate</td>
          <td>250 characters before embedding</td>
          <td>all-MiniLM does not handle long text</td>
      </tr>
      <tr>
          <td>Volume</td>
          <td>16,446 text chunks, 16 weeks</td>
          <td>Full knowledge base of the course</td>
      </tr>
      <tr>
          <td>Sync</td>
          <td>systemd timer, every 10 min</td>
          <td>Git diff → re-index only what changed</td>
      </tr>
  </tbody>
</table>
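<p>The chunking and truncation rows can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline's actual code: the function names and the regular expression are hypothetical, any <code>def</code> or <code>class</code> line (including indented methods) starts a new chunk, and decorators are not kept together with the function they decorate.</p>

```python
import re

MAX_LINES = 20   # chunk size from the table
MAX_CHARS = 250  # truncation limit from the table

# Assumption: a line that begins with def/class (at any indent) opens a new block.
BLOCK_START = re.compile(r"^\s*(?:async\s+def|def|class)\s")


def chunk_source(text, max_lines=MAX_LINES):
    """Split source text into chunks of at most max_lines lines,
    starting a new chunk at every def/class line."""
    chunks, current = [], []
    for line in text.splitlines():
        starts_block = BLOCK_START.match(line) is not None
        # Flush the running chunk at a block boundary or when it is full
        if current and (starts_block or len(current) == max_lines):
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def truncate_for_embedding(chunk, max_chars=MAX_CHARS):
    """Cut a chunk to max_chars characters before sending it to the embedder."""
    return chunk[:max_chars]
```

<p>Each chunk would then pass through <code>truncate_for_embedding</code> before being embedded and written to <code>student_knowledge</code>. Truncating by characters is a coarse stand-in for the model's token limit, so some of a long chunk is simply not embedded.</p>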
<hr>
<h2 id="что-дальше">What's next</h2>
<p>This is the first post in the <strong>RAG Pipeline</strong> series. Qdrant is the storage layer, but a working pipeline needs two more components:</p>
<ul>
<li><strong>RAG Pipeline 2/N &ndash; Embeddings: Ollama vs API</strong>: how to turn text into vectors, choosing a model, batch vs single requests, and pitfalls with Russian text</li>
<li><strong>RAG Pipeline 3/N &ndash; Chunking</strong>: chunk size, overlap, splitting on function boundaries, and metadata for citations</li>
</ul>
<hr>
<p>Telegram: <a href="https://t.me/DevITWay">@DevITWay</a>
Website: <a href="https://devopsway.ru/">devopsway.ru</a></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
