Advanced Streamlit Usage

1. Introduction

The earlier post "Streamlit使用指南 | Lian" gave an overall introduction to Streamlit but left out many details; this post fills in those gaps.

1.1 session

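A session here means the current page session; see Window.sessionStorage - Web API | MDN.

The sessionStorage property gives access to a session Storage object for the current origin. It is similar to localStorage; the difference is that data stored in localStorage has no expiration time, while data stored in sessionStorage is cleared when the page session ends.

A page session lasts as long as the tab or browser stays open, and it survives page reloads and restores.

**Opening a page in a new tab or window copies the context of the top-level browsing session as the context of the new session, which differs from how session cookies work.**

Opening multiple tabs with the same URL creates a separate sessionStorage for each tab.

Closing a browser tab or window clears the corresponding sessionStorage.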

sessionStorage.setItem('test', JSON.stringify({a: 1}))

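Streamlit's session mechanism differs slightly from the browser's sessionStorage.

st.session_state

Note that data stored in st.session_state is lost when the page is refreshed, because a refresh starts a new session. Otherwise it follows the rules above: each tab gets its own state, and closing the tab discards it.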

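2. Basic execution model

Streamlit's execution model is very simple: the script runs once to initialize the page, and any interaction afterwards reruns the whole script from top to bottom.

That includes seemingly trivial actions such as clicking a button or choosing an option in a select box.

The benefit is obvious: the logic is simple and direct to write.

The drawback is just as obvious: complex flows become hard to build, and time-consuming operations need special handling so they do not slow down every rerun.

So the key to using Streamlit well is controlling the side effects of these reruns:

Reduce how much of the page is rerun
Keep some variables alive across reruns (session state)
Avoid repeating some data requests on every rerun (interval cache)
Keep global objects alive (global cache)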

3. Important APIs

3.1 button

st.button - Streamlit Docs

Button behavior and examples - Streamlit Docs

st.button(label, key=None, help=None, on_click=None, args=None, kwargs=None, *, type="secondary", icon=None, disabled=False, use_container_width=False)

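The parameter to watch is key. Streamlit does not allow two buttons with the same label unless each has a distinct key; that is what this parameter is for.

if st.button("a"):
    print("a")

st.markdown('---')

# error: same label as the button above and no key to tell them apart
if st.button("a"):
    print("b")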

Buttons are the most common widget, but changing the button itself after it has been clicked (for example its label) needs some care.

import streamlit as st

if "button_state" not in st.session_state:
    st.session_state.button_state = False

label = 'clicked' if st.session_state.button_state else 'not clicked'

if st.button(label=label):
    st.session_state.button_state = not st.session_state.button_state
    print('click')

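The goal is to change the button's label after a click.

It is tempting to assume the code above achieves this, but it does not: the label is computed before st.button reports the click, so the label drawn in that run still reflects the old state. A working implementation follows in the fragments and rerun sections below.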

3.2 submit

st.form - Streamlit Docs

Create a form that batches elements together with a "Submit" button.

A form is a container that visually groups other elements and widgets together, and contains a Submit button. When the form's Submit button is pressed, all widget values inside the form will be sent to Streamlit in a batch.

To add elements to a form object, you can use with notation (preferred) or just call methods directly on the form. See examples below.

Forms have a few constraints:

Every form must contain a st.form_submit_button.

st.button and st.download_button cannot be added to a form.

Forms can appear anywhere in your app (sidebar, columns, etc), but they cannot be embedded inside other forms.

Within a form, the only widget that can have a callback function is st.form_submit_button.

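Create a block with a submit button.

import streamlit as st

with st.form('power'):
    st.subheader('power mode')
    # a non-empty label avoids Streamlit's empty-label warning; it is hidden visually
    p = st.selectbox('mode', ('balance', 'performance', 'energy'), label_visibility='collapsed')
    if st.form_submit_button('apply', icon=':material/charger:'):
        # adjust_power_module is a helper that is not defined in this post
        adjust_power_module(p)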

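As the execution model above shows, any interaction in Streamlit reruns the page, which would interrupt multi-step operations such as selecting a value first and then acting on it. A form avoids this by holding the widget values back until the submit button is pressed.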

3.3 st.empty

st.empty - Streamlit Docs

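An empty placeholder container: it holds a single element, and each new element written to it replaces the previous one in the same position.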

import streamlit as st
import time

with st.empty():
    for seconds in range(10):
        st.write(f"⏳ {seconds} seconds have passed")
        time.sleep(1)
    st.write(":material/check: 10 seconds over!")
st.button("Rerun")

3.4 fragments

st.fragment - Streamlit Docs

Decorator to turn a function into a fragment which can rerun independently of the full app.

When a user interacts with an input widget created inside a fragment, Streamlit only reruns the fragment instead of the full app. If run_every is set, Streamlit will also rerun the fragment at the specified interval while the session is active, even if the user is not interacting with your app.

To trigger an app rerun from inside a fragment, call st.rerun() directly. To trigger a fragment rerun from within itself, call st.rerun(scope="fragment"). Any values from the fragment that need to be accessed from the wider app should generally be stored in Session State.

When Streamlit element commands are called directly in a fragment, the elements are cleared and redrawn on each fragment rerun, just like all elements are redrawn on each app rerun. The rest of the app is persisted during a fragment rerun. When a fragment renders elements into externally created containers, the elements will not be cleared with each fragment rerun. Instead, elements will accumulate in those containers with each fragment rerun, until the next app rerun.

Calling st.sidebar in a fragment is not supported. To write elements to the sidebar with a fragment, call your fragment function inside a with st.sidebar context manager.

Fragment code can interact with Session State, imported modules, and other Streamlit elements created outside the fragment. Note that these interactions are additive across multiple fragment reruns. You are responsible for handling any side effects of that behavior.

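As the execution model above shows, every interaction reruns the whole page; st.fragment limits the rerun to a single function.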

import streamlit as st
from utils import music_module as mm

st.set_page_config(page_title="Music", page_icon=":material/queue_music:")
st.markdown('# Music')

# Run in fragments: when the interaction happens inside a fragment, only that fragment's code reruns
@st.fragment
def control():
    with st.container():
        c1, c2, c3, c4 = st.columns(4)
        with c1:
            if st.button('next', icon=':material/skip_next:'):
                mm.next()
        with c2:
            if st.button(label='play', icon=':material/play_circle:'):
                mm.play()
        with c3:
            if st.button('pause', icon=':material/pause_circle:'):
                mm.pause()
        with c4:
            if st.button('stop', icon=':material/stop_circle:'):
                mm.stop()

@st.fragment
def volume():
    st.markdown('---')
    with st.container():
        st.subheader('volume')
        c1, c2 = st.columns(2)
        with c1:
            if st.button('volume-', icon=':material/volume_down:'):
                mm.adjust_volume(False)
        with c2:
            if st.button('volume+', icon=':material/volume_up:'):
                mm.adjust_volume(True)

volume()
control()

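This decorator avoids a full reload: by splitting the page into separate modules, each one reruns locally and the impact on the rest of the page is reduced.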

import streamlit as st

if "button_state" not in st.session_state:
    st.session_state.button_state = False

@st.fragment
def testa():
    label = 'clicked' if st.session_state.button_state else 'not clicked'

    if st.button(label=label):
        st.session_state.button_state = not st.session_state.button_state
        print('clickx1')
        st.rerun(scope='fragment')

testa()

def testb():
    print('a')

testb()

When st.rerun(scope='fragment') is called inside a function decorated with st.fragment, only that fragment is rerun.

3.5 rerun

st.rerun - Streamlit Docs

Rerun the script immediately.

When st.rerun() is called, Streamlit halts the current script run and executes no further statements. Streamlit immediately queues the script to rerun.

When using st.rerun in a fragment, you can scope the rerun to the fragment. However, if a fragment is running as part of a full-app rerun, a fragment-scoped rerun is not allowed.

st.rerun(*, scope="app")

Specifies what part of the app should rerun. If scope is "app" (default), the full app reruns. If scope is "fragment", Streamlit only reruns the fragment from which this command is called.

Setting scope="fragment" is only valid inside a fragment during a fragment rerun. If st.rerun(scope="fragment") is called during a full-app rerun or outside of a fragment, Streamlit will raise a StreamlitAPIException.

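Reruns the script. The scope parameter accepts two values, "app" (the whole app) and "fragment" (the current fragment only), and defaults to "app".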

import streamlit as st

if "button_state" not in st.session_state:
    st.session_state.button_state = False

label = 'clicked' if st.session_state.button_state else 'not clicked'

if st.button(label=label):
    st.session_state.button_state = not st.session_state.button_state
    print('click')
    st.rerun()

3.6 st.dialog

st.dialog - Streamlit Docs

import streamlit as st

@st.dialog('confirm! shutdown?')
def shutdown():
    c1, c2 = st.columns(2)
    with c1:
        if st.button('ok'):
            print('shutdown /s /t 0')
            st.rerun()
    with c2:
        if st.button('cancel'):
            print('cancel')
            st.rerun()  # required, otherwise the dialog does not close

if st.button('shutdown', icon=':material/power:'):
    shutdown()

This indirectly implements a yes/no confirmation.

3.7 session state

Session State - Streamlit Docs

Stores state in the session.

3.8 cache_data

st.cache_data - Streamlit Docs

Decorator to cache functions that return data (e.g. dataframe transforms, database queries, ML inference).

Cached objects are stored in "pickled" form, which means that the return value of a cached function must be pickleable. Each caller of the cached function gets its own copy of the cached data.

You can clear a function's cache with func.clear() or clear the entire cache with st.cache_data.clear().

A function's arguments must be hashable to cache it. If you have an unhashable argument (like a database connection) or an argument you want to exclude from caching, use an underscore prefix in the argument name. In this case, Streamlit will return a cached value when all other arguments match a previous function call. Alternatively, you can declare custom hashing functions with hash_funcs.

To cache global resources, use st.cache_resource instead. Learn more about caching at https://docs.streamlit.io/develop/concepts/architecture/caching.

import streamlit as st

@st.cache_data
def get_data():
    print("get_data")
    return [1, 2, 3]

if st.button("Click me"):
    data = get_data()
    print(data)

Caches data globally so that repeated requests are avoided.

3.9 cache_resource

st.cache_resource - Streamlit Docs

Decorator to cache functions that return global resources (e.g. database connections, ML models).

Cached objects are shared across all users, sessions, and reruns. They must be thread-safe because they can be accessed from multiple threads concurrently. If thread safety is an issue, consider using st.session_state to store resources per session instead.

You can clear a function's cache with func.clear() or clear the entire cache with st.cache_resource.clear().

A function's arguments must be hashable to cache it. If you have an unhashable argument (like a database connection) or an argument you want to exclude from caching, use an underscore prefix in the argument name. In this case, Streamlit will return a cached value when all other arguments match a previous function call. Alternatively, you can declare custom hashing functions with hash_funcs.

To cache data, use st.cache_data instead. Learn more about caching at https://docs.streamlit.io/develop/concepts/architecture/caching.

import streamlit as st

class Test:
    def __init__(self):
        print('init')

    def do_something(self):
        print('doing something')

@st.cache_resource
def get_test():
    return Test()

if st.button('Click me'):
    test = get_test()
    test.do_something()

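Caches an object globally.

3.10 Summary

| | st.cache_data | st.cache_resource |
| --- | --- | --- |
| Use case | Caching a function's output, especially functions that return serializable data objects | Caching objects that need initialization but not frequent recomputation, such as database connections or loaded models |
| Characteristics | Caches the function's output; suited to functions that are called often and whose output may change | Caches the resource object itself; suited to resources that are slow to initialize but rarely updated |
| Example cached content | Fetching data from an API, loading a CSV file, data processing | Loading a pretrained model, opening a database connection |

Both suit objects that should be reused across the whole app, cutting the time spent on repeated loading.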

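4. Miscellaneous

4.1 Choosing a folder

st.file_uploader - Streamlit Docs

Streamlit has no native folder picker, so tkinter is used here instead. Note that the dialog opens on the machine running the Streamlit server, so this only works when the app runs locally.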

import streamlit as st
import tkinter as tk
from tkinter import filedialog

root = tk.Tk()
root.withdraw()

root.wm_attributes('-topmost', 1)
clicked = st.button('Folder Picker')
if clicked:
    dirname = filedialog.askdirectory(master=root)
    if dirname:
        st.write('You selected:', dirname)

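4.2 Multiple pages

Navigation and pages - Streamlit Docs

There are two ways to do this: use a pages folder, or configure everything in one place with st.navigation, which needs no pages folder but requires a newer Streamlit version.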

from pathlib import Path

import streamlit as st

dir_path = Path(__file__).parent

# Note that this needs to be in a method so we can have an e2e playwright test.
def run():
    page = st.navigation(
        [
            st.Page(
                dir_path / "hello.py", title="Hello", icon=":material/waving_hand:"
            ),
            st.Page(
                dir_path / "dataframe_demo.py",
                title="DataFrame demo",
                icon=":material/table:",
            ),
            st.Page(
                dir_path / "plotting_demo.py",
                title="Plotting demo",
                icon=":material/show_chart:",
            ),
            st.Page(
                dir_path / "mapping_demo.py",
                title="Mapping demo",
                icon=":material/public:",
            ),
            st.Page(
                dir_path / "animation_demo.py",
                title="Animation demo",
                icon=":material/animation:",
            ),
        ]
    )
    page.run()

if __name__ == "__main__":
    run()

4.3 chat

Implements the chat-style interface used by AI assistants.

import streamlit as st
import random
import time

# Streamed response emulator
def response_generator():
    response = random.choice(
        [
            "Hello there! How can I assist you today?",
            "Hi, human! Is there anything I can help you with?",
            "Do you need help?",
        ]
    )
    for word in response.split():
        yield word + " "
        time.sleep(0.05)

st.title("Simple chat")

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat messages from history on app rerun
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Accept user input
if prompt := st.chat_input("What is up?"):
    # Add user message to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
    # Display user message in chat message container
    with st.chat_message("user"):
        st.markdown(prompt)

    # Display assistant response in chat message container
    with st.chat_message("assistant"):
        response = st.write_stream(response_generator())
    # Add assistant response to chat history
    st.session_state.messages.append({"role": "assistant", "content": response})

4.4 Math formulas

st.markdown('''
$$
f(x) = x ^ 2 \\\\
e = mc ^ 2
$$
''')

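Pay attention to how escape characters are handled here; many similar tools have the same issue when converting Markdown to HTML.

import streamlit as st

# Mind the line break: a normal Python string escapes backslashes, so the LaTeX line break \\ must be written as four backslashes (\\\\)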
st.latex('''
f(x) = x ^ 2 \\\\
e = mc ^ 2
''')

With st.latex, wrapping the formula in st.markdown is not needed.

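5. Conclusion

Once you understand Streamlit's execution model and handle the problems caused by the interaction-rerun cycle, it is a very easy-to-use front-end framework that makes building small apps or product prototypes simple.

More complex features are also possible, such as embedding drag-and-drop data visualization: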

Use Pygwalker In Streamlit - Streamlit