5.15 Sparse Float Vectors
Last updated: 8/5/2024, 6:58:36 AM

Hippo supports sparse float vectors. The distance metric for them is inner product (IP).

Create a table

curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}?pretty' -H 'Content-Type: application/json' -d'{
  "settings": {
    "number_of_shards" : 1,
    "number_of_replicas" : 1
  },
  "schema": {
    "auto_id": false,
    "fields": [
      {
        "name": "book_id",
        "is_primary_key": true,
        "data_type": "int64"
      },
      {
        "name": "word_count",
        "is_primary_key": false,
        "data_type": "int64"
      },
      {
        "name": "book_intro",
        "data_type": "sparse_float_vector",
        "is_primary_key": false
      }
    ]
  }
}';

Response:

{
  "acknowledged" : true
}
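
For readers who script against Hippo rather than using curl, the same request can be assembled with Python's standard library. This is a minimal sketch, not an official client: the helper name `build_create_table_request` and the table name `books` are hypothetical, and the host, port, credentials, and body are copied from the curl example above. The request is only built, not sent.

```python
import base64
import json
import urllib.request

# Host and port taken from the curl example; adjust for your deployment.
HIPPO_URL = "http://localhost:8902/hippo/v1"


def build_create_table_request(table, user="shiva", password="shiva"):
    """Build (but do not send) the PUT request that creates the table."""
    body = {
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},
        "schema": {
            "auto_id": False,
            "fields": [
                {"name": "book_id", "is_primary_key": True, "data_type": "int64"},
                {"name": "word_count", "is_primary_key": False, "data_type": "int64"},
                {
                    "name": "book_intro",
                    "is_primary_key": False,
                    "data_type": "sparse_float_vector",
                },
            ],
        },
    }
    # curl -u user:password sends HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{HIPPO_URL}/{table}?pretty",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    req = build_create_table_request("books")  # hypothetical table name
    print(req.get_method(), req.full_url)
    # To send it against a running Hippo instance:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```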

Insert data

curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_bulk?pretty' -H 'Content-Type: application/json' -d'{
  "fields_data": [
    {
      "field_name": "book_id",
      "field": [
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
      ]
    },
    {
      "field_name": "word_count",
      "field": [ 1000,2000,3000,4000,5000,6000,7000,8000,9000,10000,11000,12000,13000,14000,15000,16000,17000,18000,19000,20000,21000,22000,23000,24000,25000,26000,27000,28000,29000,30000,31000,32000,33000,34000,35000,36000,37000,38000,39000,40000,41000,42000,43000,44000,45000,46000,47000,48000,49000,50000,51000,52000,53000,54000,55000,56000,57000,58000,59000,60000,61000,62000,63000,64000,65000,66000,67000,68000,69000,70000,71000,72000,73000,74000,75000,76000,77000,78000,79000,80000,81000,82000,83000,84000,85000,86000,87000,88000,89000,90000,91000,92000,93000,94000,95000,96000,97000,98000,99000,100000
      ]
    },
    {
      "field_name": "book_intro",
      "field": [
        ["1:1", "1:1"],["2:1", "1:2"],["3:1", "1:3"],["4:1", "1:4"],["5:1", "1:5"],["6:1", "1:6"],["7:1", "1:7"],["8:1", "1:8"],["9:1", "1:9"],["10:1", "1:10"],["11:1", "1:11"],["12:1", "1:12"],["13:1", "1:13"],["14:1", "1:14"],["15:1", "1:15"],["16:1", "1:16"],["17:1", "1:17"],["18:1", "1:18"],["19:1", "1:19"],["20:1", "1:20"],["21:1", "1:21"],["22:1", "1:22"],["23:1", "1:23"],["24:1", "1:24"],["25:1", "1:25"],["26:1", "1:26"],["27:1", "1:27"],["28:1", "1:28"],["29:1", "1:29"],["30:1", "1:30"],["31:1", "1:31"],["32:1", "1:32"],["33:1", "1:33"],["34:1", "1:34"],["35:1", "1:35"],["36:1", "1:36"],["37:1", "1:37"],["38:1", "1:38"],["39:1", "1:39"],["40:1", "1:40"],["41:1", "1:41"],["42:1", "1:42"],["43:1", "1:43"],["44:1", "1:44"],["45:1", "1:45"],["46:1", "1:46"],["47:1", "1:47"],["48:1", "1:48"],["49:1", "1:49"],["50:1", "1:50"],["51:1", "1:51"],["52:1", "1:52"],["53:1", "1:53"],["54:1", "1:54"],["55:1", "1:55"],["56:1", "1:56"],["57:1", "1:57"],["58:1", "1:58"],["59:1", "1:59"],["60:1", "1:60"],["61:1", "1:61"],["62:1", "1:62"],["63:1", "1:63"],["64:1", "1:64"],["65:1", "1:65"],["66:1", "1:66"],["67:1", "1:67"],["68:1", "1:68"],["69:1", "1:69"],["70:1", "1:70"],["71:1", "1:71"],["72:1", "1:72"],["73:1", "1:73"],["74:1", "1:74"],["75:1", "1:75"],["76:1", "1:76"],["77:1", "1:77"],["78:1", "1:78"],["79:1", "1:79"],["80:1", "1:80"],["81:1", "1:81"],["82:1", "1:82"],["83:1", "1:83"],["84:1", "1:84"],["85:1", "1:85"],["86:1", "1:86"],["87:1", "1:87"],["88:1", "1:88"],["89:1", "1:89"],["90:1", "1:90"],["91:1", "1:91"],["92:1", "1:92"],["93:1", "1:93"],["94:1", "1:94"],["95:1", "1:95"],["96:1", "1:96"],["97:1", "1:97"],["98:1", "1:98"],["99:1", "1:99"],["100:1", "1:100"]
      ]
    }
  ],
  "num_rows": 100,
  "op_type": "insert"
}';

Response:

{
  "succ_index" : [
    0,
    1,
    2,
    3,
    …,
    97,
    98,
    99
  ],
  "insert_cnt" : 100
}
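
The 100-row body in the request above follows a regular pattern, so it may be easier to generate than to type. The sketch below rebuilds the same `fields_data` structure in Python; the function name `build_bulk_payload` is hypothetical, and the pattern (row i gets `book_id` i, `word_count` i * 1000, and the sparse vector `["i:1", "1:i"]`) is read off the sample data rather than taken from any Hippo specification.

```python
import json


def build_bulk_payload(num_rows=100):
    """Rebuild the bulk-insert body from the example for rows 1..num_rows."""
    ids = list(range(1, num_rows + 1))
    return {
        "fields_data": [
            {"field_name": "book_id", "field": ids},
            {"field_name": "word_count", "field": [i * 1000 for i in ids]},
            # Each sparse vector is a list of strings, as in the sample data.
            {"field_name": "book_intro", "field": [[f"{i}:1", f"1:{i}"] for i in ids]},
        ],
        "num_rows": num_rows,
        "op_type": "insert",
    }


if __name__ == "__main__":
    payload = build_bulk_payload()
    # Print the JSON body; redirect it to a file to reuse it with curl.
    print(json.dumps(payload))
```

If the output is saved to a file such as `bulk.json`, curl can read the body from it with `-d @bulk.json` instead of an inline string.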

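Create a vector index

The following example creates a vector index. For sparse float vectors, the only supported index type is the sparse graph index (SPARSE_HNSW).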

curl -u shiva:shiva -XPUT 'localhost:8902/hippo/v1/{table}/_create_embedding_index?pretty' -H 'Content-Type: application/json' -d'{
  "field_name" : "book_intro",
  "index_name" : "sparse_index",
  "metric_type" : "ip",
  "index_type": "SPARSE_HNSW",
  "params": {
    "M" : 8,
    "ef_construction" : 64,
    "norm_bm25" : false
  }
}';

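The norm_bm25 parameter specifies whether Hippo automatically applies BM25 normalization to the sparse vectors. It defaults to true. When the sparse vectors come from SPLADE, set this option to false.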

Response:

{
  "acknowledged" : true
}
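
The norm_bm25 rule can be captured in a small helper that builds the index request body. This is a sketch under stated assumptions: the function name and the `vectors_from_splade` flag are hypothetical, and for vectors that do not come from SPLADE it simply keeps the documented default of true.

```python
def build_sparse_index_body(field_name, index_name, vectors_from_splade,
                            m=8, ef_construction=64):
    """Build the body for _create_embedding_index on a sparse vector field."""
    return {
        "field_name": field_name,
        "index_name": index_name,
        "metric_type": "ip",           # sparse float vectors use inner product
        "index_type": "SPARSE_HNSW",   # the only index type for sparse vectors
        "params": {
            "M": m,
            "ef_construction": ef_construction,
            # SPLADE vectors: disable BM25 normalization.
            # Otherwise: keep the documented default (true).
            "norm_bm25": not vectors_from_splade,
        },
    }


if __name__ == "__main__":
    print(build_sparse_index_body("book_intro", "sparse_index", vectors_from_splade=True))
```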

Vector search

Once the vector index has been activated, you can run searches. For example:

curl -u shiva:shiva -XGET 'localhost:8902/hippo/v1/{table}/_search?pretty' -H 'Content-Type: application/json' -d'{
  "output_fields": ["book_id"],
  "search_params": {
    "anns_field": "book_intro",
    "topk": 2,
    "params": {
      "ef_search": 10
    },
    "embedding_index": "sparse_index"
  },
  "sparse_vectors": [ ["1:1", "1:1"], ["2:1", "3:1"] ],
  "round_decimal": 2,
  "only_explain" : false
}';

Response:

{
  "num_queries" : 2,
  "top_k" : 2,
  "results" : [
    {
      "query" : 0,
      "fields_data" : [
        {
          "field_name" : "book_id",
          "field_values" : [
            100,
            99
          ]
        }
      ],
      "scores" : [
        100.0,
        99.0
      ]
    },
    {
      "query" : 1,
      "fields_data" : [
        {
          "field_name" : "book_id",
          "field_values" : [
            3,
            2
          ]
        }
      ],
      "scores" : [
        1.0,
        1.0
      ]
    }
  ]
}
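
To see where these scores come from, the inner product can be recomputed by brute force outside Hippo. The sketch below assumes that each string in a sparse vector is an `index:value` pair and that a repeated index keeps its last value. Both are assumptions inferred from the sample data and response, not documented behavior. It performs an exact scan, not the approximate SPARSE_HNSW search, and all function names are hypothetical.

```python
def parse_sparse(entries):
    """Parse ["index:value", ...] into {index: value}; a repeated index keeps the last value."""
    vec = {}
    for entry in entries:
        idx, val = entry.split(":")
        vec[int(idx)] = float(val)
    return vec


def inner_product(a, b):
    """Inner product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(v * b[i] for i, v in a.items() if i in b)


def brute_force_search(rows, query, topk):
    """Return the topk (row_id, score) pairs with the largest inner product."""
    q = parse_sparse(query)
    scored = [(inner_product(parse_sparse(vec), q), row_id) for row_id, vec in rows.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(row_id, score) for score, row_id in scored[:topk]]


# Same data as the bulk-insert example: row i holds ["i:1", "1:i"].
rows = {i: [f"{i}:1", f"1:{i}"] for i in range(1, 101)}

if __name__ == "__main__":
    for query in (["1:1", "1:1"], ["2:1", "3:1"]):
        print(query, brute_force_search(rows, query, topk=2))
```

Under these assumptions the first query scores row i as i, so rows 100 and 99 rank first with scores 100.0 and 99.0. The second query gives rows 2 and 3 a score of 1.0 each and every other row 0, which agrees with the response above. The order of the two tied rows may differ between this scan and Hippo.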