Sessions & Authentication¶
Silkweb sessions persist browser state (cookies, localStorage, sessionStorage) across requests, enabling scraping of authenticated pages.
Creating a session¶
Record a session interactively¶
Open a real browser, log in manually, and Silkweb saves the session state.
record_session is async (it drives Playwright). Call it from an async entrypoint or asyncio.run:
import asyncio
import silkweb
async def main():
# Opens a visible browser window — log in, navigate, etc.
# Session is saved when you close the browser.
await silkweb.record_session("my-site")
asyncio.run(main())
The session is stored at ~/.silkweb/sessions/my-site.silkweb.
Session file format¶
The .silkweb file is JSON containing:
{
"cookies": [...],
"localStorage": {...},
"url": "https://example.com/dashboard",
"created_at": "2024-01-15T10:30:00Z",
"ua": "Mozilla/5.0 ...",
"actions": [...]
}
Using a session¶
Fetch with persisted auth¶
import asyncio
from silkweb.session.session import SilkSession
async def main():
session = SilkSession("my-site") # loads from disk
page = await session.fetch("https://example.com/dashboard")
# Cookies and localStorage are automatically applied.
#
# NOTE: `SilkSession.fetch()` returns a Playwright `Page`, not a `SilkPage`,
# so use Playwright methods to read content:
html = await page.content()
print(html[:500])
asyncio.run(main())
Interact with pages¶
import asyncio
from silkweb.session.session import SilkSession
async def main():
session = SilkSession("my-site")
# Fill a search box
await session.fill("input[name='query']", "python")
# Click a button
await session.click("button[type='submit']")
# Wait for results
await session.wait_for(".results-list", timeout=10000)
# Fetch the resulting page
page = await session.fetch("https://example.com/search?q=python")
asyncio.run(main())
Save and reload¶
import asyncio
from silkweb.session.session import SilkSession
async def persist():
session = SilkSession("my-site")
# ... interact with pages ...
await session.save() # persist updated cookies/storage
asyncio.run(persist())
# Later:
session = SilkSession.load("my-site")
Replay a session¶
Replay recorded actions in headless mode. replay_session is async (same as record_session):
import asyncio
import silkweb
async def main():
await silkweb.replay_session("my-site")
asyncio.run(main())
This plays back all recorded navigations, form fills, and clicks without opening a visible browser.
replay vs replay_session
replay_session(name)— Replays a browser session saved under~/.silkweb/sessions/<name>.silkweb(see above).replay(session_file)— Loads HTTP fetch recordings fromconfigure(replay_dir=...): a JSON*.silkwebfile with anhtml_pathkey pointing at a sibling.htmlfile. Missing metadata or files raiseSilkwebErrorwith a clear message. Not the same as Playwright session replay.
Cookie expiry detection¶
When fetching with a session, Silkweb checks auth cookies for expiry:
import asyncio
from silkweb.exceptions import SilkwebSessionExpiredError
from silkweb.session.session import SilkSession
async def main():
session = SilkSession("my-site")
try:
page = await session.fetch("https://example.com/dashboard")
except SilkwebSessionExpiredError as e:
print(e) # suggests re-recording, e.g. await silkweb.record_session("my-site")
asyncio.run(main())
Cookies identified as auth-related (containing session, auth, token, jwt, or sid in the name) are checked for expires timestamps.
CLI¶
# Record a session
silkweb shell https://example.com/login
# Fetch with a session (via the Python API)
python -c "
from silkweb.session.session import SilkSession
import asyncio
async def main():
s = SilkSession('my-site')
page = await s.fetch('https://example.com/dashboard')
html = await page.content()
print(html[:500])
asyncio.run(main())
"