pupil.db package
pupil.db.database module
- class pupil.db.DataBase(vecdb: pupil.db.vector.VectorDB, metadb: pupil.db.meta.MetaDataDB)
Bases: object
- add(metadata: Any, embeddings: NDArray2D)
- embbeding_search(embeddings: NDArray2D, n_results: int) → NDArray2D
- classmethod from_encoder(data_schema: pupil.types.IsDataclass, label_col_name: str, data_col_name: str, data: pandas.core.frame.DataFrame, encoder: Callable[[...], NDArray2D])
- get(i: Union[int, List[int], NDArray[Any, Int32]], return_embeddings: bool = False, return_metadata: bool = True) → GetDatabase
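The signatures above suggest that a DataBase pairs a vector store with a metadata store and addresses both by row number. The following is a minimal sketch of that idea, not the library's implementation: `TinyDataBase` and `GetResult` are hypothetical names, plain Python lists stand in for NDArray2D, and the assumption that both stores share one row index is ours.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional, Union


@dataclass
class GetResult:
    """Hypothetical stand-in for the GetDatabase return type."""
    metadata: Optional[List[Any]] = None
    embeddings: Optional[List[List[float]]] = None


class TinyDataBase:
    """Illustrative only: keeps embeddings and metadata in parallel lists."""

    def __init__(self) -> None:
        self._embeddings: List[List[float]] = []
        self._metadata: List[Any] = []

    def add(self, metadata: List[Any], embeddings: List[List[float]]) -> None:
        # Rows must line up one-to-one so a shared index addresses both stores.
        if len(metadata) != len(embeddings):
            raise ValueError("metadata and embeddings must have the same length")
        self._metadata.extend(metadata)
        self._embeddings.extend(embeddings)

    def get(
        self,
        i: Union[int, Iterable[int]],
        return_embeddings: bool = False,
        return_metadata: bool = True,
    ) -> GetResult:
        # Accept a single row number or several, mirroring the documented signature.
        idx = [i] if isinstance(i, int) else list(i)
        return GetResult(
            metadata=[self._metadata[j] for j in idx] if return_metadata else None,
            embeddings=[self._embeddings[j] for j in idx] if return_embeddings else None,
        )


db = TinyDataBase()
db.add(
    metadata=["cat", "dog", "bird"],
    embeddings=[[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
)
result = db.get([0, 2], return_embeddings=True)
```

Here `result.metadata` holds the entries for rows 0 and 2, and `result.embeddings` holds the matching vectors. With the default flags only the metadata is returned.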
pupil.db.meta module
- class pupil.db.PandasDB(schema: pupil.types.IsDataclass, label: str)
Bases: object
- add(data: pandas.core.frame.DataFrame) → None
Add a DataFrame to the database. It must contain the label column that was passed when the instance was created.
- Parameters
data (pd.DataFrame) – DataFrame to add; it must include the label column.
- filter(field: str, value: Any) → List[pupil.types.IsDataclass]
Filter the data based on a column and a value.
- Parameters
field (str) – Name of the column
value (Any) – Value to filter on
- Raises
ValueError – If field is not one of the columns
- Returns
List of schema objects
- Return type
List[IsDataclass]
- get(index: Union[int, Iterable[int]]) → List[pupil.types.IsDataclass]
Get data from the DataFrame.
- Parameters
index (Union[int, Iterable[int]]) – Row number, or an iterable of row numbers
- Returns
List of schema objects
- Return type
List[IsDataclass]
- property is_labeled
- set_label(i: int, input: Any) → None
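The documented behaviour of `add`, `filter`, `get` and `set_label` can be illustrated with a small stand-in. This is a sketch under stated assumptions, not PandasDB itself: `TinyMetaDB` and the `Review` schema are hypothetical, and a list of dicts takes the place of the DataFrame so the example runs with the standard library alone.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Iterable, List, Optional, Union


@dataclass
class Review:
    """Hypothetical schema; in practice this is the dataclass passed as `schema`."""
    text: str
    label: Optional[str] = None


class TinyMetaDB:
    """Illustrative stand-in that stores rows as dicts instead of a DataFrame."""

    def __init__(self, schema: type, label: str) -> None:
        self.schema = schema
        self.label = label
        self._columns = [f.name for f in fields(schema)]
        self._rows: List[Dict[str, Any]] = []

    def add(self, rows: List[Dict[str, Any]]) -> None:
        # Every row must carry the label column named at construction time.
        for row in rows:
            if self.label not in row:
                raise ValueError(f"missing label column {self.label!r}")
            self._rows.append(dict(row))

    def filter(self, field: str, value: Any) -> List[Any]:
        # Unknown columns raise ValueError, as the reference describes.
        if field not in self._columns:
            raise ValueError(f"{field!r} is not a column")
        return [self.schema(**r) for r in self._rows if r.get(field) == value]

    def get(self, index: Union[int, Iterable[int]]) -> List[Any]:
        # A single row number is treated as a one-element selection.
        idx = [index] if isinstance(index, int) else list(index)
        return [self.schema(**self._rows[i]) for i in idx]

    def set_label(self, i: int, input: Any) -> None:
        # Parameter name `input` mirrors the documented signature.
        self._rows[i][self.label] = input


meta = TinyMetaDB(schema=Review, label="label")
meta.add([{"text": "great", "label": "pos"}, {"text": "meh", "label": None}])
meta.set_label(1, "neg")
negatives = meta.filter("label", "neg")
```

Both `filter` and `get` return lists of schema objects, so `negatives` is a list containing one `Review`, and `meta.get(0)` returns a one-element list even for a single row number.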
pupil.db.vector module
- class pupil.db.FaissVectorDB(similarity_metric: pupil.types.Distance = Distance.COSINE, nlist: Optional[int] = None, nprobe: Optional[int] = 5)
Bases: object
- add(embeddings: NDArray2D) → None
Add embeddings to the database.
- Parameters
embeddings (NDArray2D) – Embeddings to add
- build_index(embeddings: NDArray2D) → None
- search(query: NDArray2D, n_results: int = 4) → Tuple[NDArray2D, NDArray2D]
Search embeddings to get the closest embeddings to the queries.
- Parameters
query (NDArray2D) – Vectors to search
n_results (int, optional) – Number of results per query. Defaults to 4.
- Returns
Tuple of (distances, indices)
- Return type
Tuple[NDArray2D, NDArray2D]
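To make the shape of the `search` result concrete, here is a brute-force sketch that returns one row of distances and one row of indices per query. It is not how FaissVectorDB works internally: `brute_force_search` and `cosine_distance` are hypothetical helpers, plain lists stand in for NDArray2D, and we assume "distance" means 1 minus cosine similarity with smaller values being closer, which may differ from the values Faiss reports.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def brute_force_search(
    stored: List[List[float]],
    query: List[List[float]],
    n_results: int = 4,
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return (distances, indices), each with one row per query vector."""
    all_distances: List[List[float]] = []
    all_indices: List[List[int]] = []
    for q in query:
        # Rank every stored vector by distance to this query, closest first.
        ranked = sorted((cosine_distance(q, v), i) for i, v in enumerate(stored))
        top = ranked[:n_results]
        all_distances.append([d for d, _ in top])
        all_indices.append([i for _, i in top])
    return all_distances, all_indices


stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
distances, indices = brute_force_search(stored, query=[[1.0, 0.1]], n_results=2)
```

For the single query above, `indices` is `[[0, 2]]`: row 0 is the nearest stored vector and row 2 the next nearest, with `distances` holding the matching values in the same order.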