Merged
65 changes: 64 additions & 1 deletion docs/index.md
@@ -16,11 +16,74 @@ pip install impresso

The library requires Python version `3.10` or higher. It also depends on several packages commonly found in Jupyter environments, such as `matplotlib` and `pandas`.

## At a glance

### Create a session

```python
from impresso import connect
client = connect()
```

### Search

```python
results = client.search.find(term="moon landing")
results
```

`results` displays a summary of the result, including a preview of a pandas data frame with the result data. Use the `df` property to access the full data frame:

```python
results.df
```

### Pagination

!!! warning "Monthly Quota"
    Every Impresso user has a monthly quota of content items they can access.
    The quota is currently set at 200,000 content items. Paginating through a
    large result set can exhaust the quota quickly.
    Check the size of the full result set before fetching all pages.

By default, every result object holds only the first page of the full result set. Use the following code to go through the rest of the pages:

```python
import pandas as pd

# Get the first page, with 100 items per page
results = client.search.find(term="landing", limit=100)
print(f"Full result contains {results.total} items.")

full_df = results.df

# Iterate through all pages and append each page's data frame
for page in results.pages():
    full_df = pd.concat([full_df, page.df])

full_df
```
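
Given the quota warning above, it may help to refuse to paginate when the result set is larger than expected. The following is a minimal sketch, not part of the library: `fetch_all_pages` and its `max_items` parameter are hypothetical names, and the helper only assumes that the result object exposes `total`, `df` and `pages()` as used in the loop above.

```python
import pandas as pd


def fetch_all_pages(results, max_items=10_000):
    """Concatenate all pages of a result, unless it is larger than max_items.

    Hypothetical helper: relies only on `results.total`, `results.df`
    and `results.pages()`, mirroring the pagination loop shown above.
    """
    # Check the size of the full result set before fetching anything else
    if results.total > max_items:
        raise ValueError(
            f"Result set has {results.total} items, "
            f"which exceeds the limit of {max_items}."
        )

    # Start with the first page, then collect the remaining pages
    frames = [results.df]
    for page in results.pages():
        frames.append(page.df)

    # Concatenate once at the end instead of growing the frame in the loop
    return pd.concat(frames)
```

With a live session it could be called as `fetch_all_pages(client.search.find(term="landing", limit=100), max_items=5_000)`. Collecting the frames in a list and concatenating once avoids copying the accumulated data on every iteration.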

### Accessing transcripts

Content item transcripts can be large and are not returned by default.
To access a transcript, request it by content item ID:

```python
result = client.content_items.get("NZG-1877-10-20-a-i0024")
result.df['text.content'][0]
```
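
Because transcripts can be long, printing only the beginning is often enough for a first look. The sketch below assumes the data frame has a `text.content` column, as in the example above; `transcript_preview` and `max_chars` are hypothetical names and not part of the library.

```python
from typing import Optional

import pandas as pd


def transcript_preview(
    df: pd.DataFrame, max_chars: int = 200, column: str = "text.content"
) -> Optional[str]:
    """Return the first `max_chars` characters of the first row's transcript.

    Hypothetical helper: returns None when the data frame is empty or
    the transcript column is missing.
    """
    if df.empty or column not in df.columns:
        return None

    # Take the first row by position, regardless of how the frame is indexed
    text = str(df[column].iloc[0])
    if len(text) <= max_chars:
        return text

    # Mark truncated text with an ellipsis
    return text[:max_chars].rstrip() + "..."
```

For example, `transcript_preview(result.df)` would return a short excerpt of the transcript fetched above.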

### See content item on Web App (shortcut)

To see a specific content item in the Web App, look for the link "See this result in the Impresso App" in the rendered result summary:

```python
result = client.content_items.get("NZG-1877-10-20-a-i0024")
result
```

## Create a session

::: impresso.connect


## About Impresso

### Impresso project
4 changes: 4 additions & 0 deletions docs/resources.md
@@ -29,6 +29,9 @@ impresso.search.facet(facet='newspaper', term='war')
::: impresso.resources.search.SearchResource

::: impresso.api_client.models.search_order_by.SearchOrderByLiteral
::: impresso.api_client.models.content_item_access_rights_copyright.ContentItemAccessRightsCopyrightLiteral
::: impresso.resources.tools.Embedding

::: impresso.resources.search.SearchDataContainer

## Entities
@@ -72,6 +75,7 @@ impresso.media_sources.find(

::: impresso.resources.media_sources.MediaSourcesResource

::: impresso.api_client.models.find_media_sources_type.FindMediaSourcesTypeLiteral
::: impresso.api_client.models.find_media_sources_order_by.FindMediaSourcesOrderByLiteral
::: impresso.resources.media_sources.FindMediaSourcesContainer
