Embedding base64 images in Markdown documents

1. Preface

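For an independent blog, the biggest hassle is image storage. Free third-party image hosts in China are basically unusable (they delete images arbitrarily, the service is highly unstable, and a site can shut down at any time). A few overseas hosts are relatively stable, but access is a problem. Gitee has blocked hotlinking, and access to GitHub is highly unstable.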

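Because 路过图床 (imgse.com) tightened its limits on free image uploads, the only option left was to convert images to base64 and embed them in the Markdown document. I do not want screenshots to be over-compressed and would like to keep a high-resolution copy of each one, so the image files are very large. Embedding them directly removes the dependency on the network, but it makes the document very bloated.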
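To put a number on the bloat: base64 encodes every 3 bytes as 4 characters, so the embedded text is about one third larger than the image file itself. The following is a minimal sketch; the helper name `base64_length` and the 300 KB stand-in payload are illustrative assumptions, not part of the original script.

```python
import base64
import math


def base64_length(n_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of n_bytes bytes."""
    return 4 * math.ceil(n_bytes / 3)


# Stand-in for a screenshot of roughly 300 KB (hypothetical size)
raw = bytes(300_000)
encoded = base64.b64encode(raw)

# The formula agrees with the real encoder
assert len(encoded) == base64_length(len(raw))

# Ratio of embedded text size to the original file size, about 1.33
print(len(encoded) / len(raw))
```

This is why shrinking the image before encoding matters more than anything done to the base64 text afterwards.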

A base64 image cannot be embedded directly with the standard image syntax:

![image](img_url)
# In Typora, replacing img_url with the base64 string directly does not display the image

<img src=""></img>

Using a bare img tag does not display the image either.

2. Base64 encoding

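A simple Python script is enough; it is invoked from the right-click context menu.

This functionality is already integrated into Kyouichirou/markdown_project: assistance for typora markdown file typesetting (github.com), which automatically converts local images to base64.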

__all__ = ['convert']

import base64
import os

import pyperclip as pc


def copy_data_to_clip(data_text: str) -> None:
    # Copy the generated markup to the clipboard
    pc.copy(data_text)


def base64_template(image_base64: str) -> str:
    # HTML template that wraps the image
    return (
        '<p><span class="md-image">'
        f'<img alt="img" src="{image_base64}" referrerpolicy="no-referrer">'
        '</span></p>'
    )


def get_file_extend(file_path: str) -> str:
    # File extension without the leading dot, lower-cased
    return os.path.splitext(file_path)[1][1:].lower()


def file_to_base64(file_path: str) -> str:
    # Build the data URI prefix; the MIME subtype for .jpg files is "jpeg"
    ext = get_file_extend(file_path)
    subtype = 'jpeg' if ext == 'jpg' else ext
    prefix = f'data:image/{subtype};base64,'
    # Read the file and encode it as base64
    with open(file_path, 'rb') as f:
        image_bytes = f.read()
    return prefix + base64.b64encode(image_bytes).decode('ascii')


def convert(file_path: str) -> None:
    # Encode the image, wrap it in the template, and put it on the clipboard
    copy_data_to_clip(base64_template(file_to_base64(file_path)))
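The automatic conversion of local images mentioned above could be sketched roughly as follows. This is not the implementation used in markdown_project; the names `IMAGE_LINK`, `to_data_uri` and `embed_local_images` are hypothetical, and using the standard-library `mimetypes` module to pick the MIME type is an assumption of this sketch. Links that do not point to an existing local file (for example remote URLs) are left untouched.

```python
import base64
import html
import mimetypes
import os
import re

# Matches inline images such as ![alt](path/to/file.png)
IMAGE_LINK = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<path>[^)\s]+)\)')


def to_data_uri(file_path: str) -> str:
    # Guess the MIME type from the extension; fall back to a generic type
    mime, _ = mimetypes.guess_type(file_path)
    with open(file_path, 'rb') as f:
        payload = base64.b64encode(f.read()).decode('ascii')
    return f'data:{mime or "application/octet-stream"};base64,{payload}'


def embed_local_images(markdown: str, base_dir: str = '.') -> str:
    """Replace links to local image files with inline base64 img markup."""

    def replace(match: re.Match) -> str:
        path = os.path.join(base_dir, match.group('path'))
        if not os.path.isfile(path):
            # Remote URL or missing file: leave the link untouched
            return match.group(0)
        alt = html.escape(match.group('alt'), quote=True)
        return (f'<p><span class="md-image"><img alt="{alt}" '
                f'src="{to_data_uri(path)}" referrerpolicy="no-referrer"></span></p>')

    return IMAGE_LINK.sub(replace, markdown)
```

Calling `embed_local_images(text, base_dir)` on the contents of a Markdown file returns the text with each local image replaced by the wrapped img markup; writing the result back to disk is left to the caller.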

2.1 Image compression

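WebP was first released in 2010, with the goal of reducing file size while matching the image quality of JPEG, in the hope of shortening the time it takes to send images over the network. In November 2011, Google announced lossless compression and transparency (alpha channel) support for WebP, which was officially included in the reference implementation libwebp 0.2.0 on August 16, 2012. According to Google's early tests, WebP lossless compression produced files 45% smaller than PNGs found on the web, and still 28% smaller than those PNGs after they had been processed with pngcrush and PNGOUT.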

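Screenshots are saved as PNG files, so the files are compressed before being embedded. The initial idea was to convert images to WebP, which greatly reduces file size while preserving sharpness. I planned to use a screenshot tool that supports WebP, but a search turned up no screenshot tool with native WebP support, and Windows 10 does not support WebP natively either. In the end, screenshots are still taken as PNG files, then converted to WebP, and then compressed further depending on their size.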

import base64
from io import BytesIO

from PIL import Image


def compress(file_path: str) -> str:
    with Image.open(file_path) as img:
        x, y = img.size
        im_file = BytesIO()
        if x > 1440:
            # Shrink the image so that its width does not exceed 1440,
            # keeping the aspect ratio
            img_resized = img.resize((1440, int(y * 1440 / x)))
            # At quality 60 the blur is already clearly visible to the naked eye
            img_resized.save(im_file, format='webp', quality=60)
        else:
            # This save is still required: it is what converts the image to WebP
            img.save(im_file, format='webp', quality=60)
        # Get the encoded bytes from the buffer; tobytes() returns raw pixel
        # data and is not suitable for compressed formats such as PNG
        im_bytes = im_file.getvalue()
        pre_fix = 'data:image/webp;base64,'
        return pre_fix + base64.b64encode(im_bytes).decode('ascii')

Image.tobytes(encoder_name='raw', *args)
Return image as a bytes object.

Warning: This method returns the raw image data from the internal storage. For compressed image data (e.g. PNG, JPEG) use save(), with a BytesIO parameter for in-memory data.

Parameters:
encoder_name: What encoder to use. The default is to use the standard "raw" encoder. A list of C encoders can be seen under the codecs section of the function array in _imaging.c. Python encoders are registered within the relevant plugins.
args: Extra arguments to the encoder.

Returns:
A bytes object.

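Conversion and compression can be done in a single step; Pillow (PIL Fork) 9.5.0 is all that is needed.

Although WebP is used and supported by most of the big tech companies, it is still far from widespread enough to serve as a baseline format, and it remains rather awkward to use; for example, the Windows 10 image viewer still does not support WebP.

After processing, an image can easily be reduced to about 30% of its original size with no visible loss of quality; if some loss is acceptable, it can be reduced to roughly 10%.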
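The idea of compressing further depending on size can be sketched as a loop that lowers the quality step by step until the encoded image fits a byte budget. This is only an illustration under stated assumptions: the names `encode_within_budget` and `webp_encoder`, the quality ladder, and the budget are all hypothetical, and Pillow is imported lazily so that the selection logic can be tried with any encoder function.

```python
from typing import Callable, Sequence, Tuple


def encode_within_budget(
    encode: Callable[[int], bytes],
    max_bytes: int,
    qualities: Sequence[int] = (90, 80, 70, 60, 50, 40),
) -> Tuple[int, bytes]:
    """Try qualities from high to low and return the first result that fits.

    If no quality fits the budget, the result for the lowest quality is returned.
    """
    quality, data = qualities[0], b''
    for quality in qualities:
        data = encode(quality)
        if len(data) <= max_bytes:
            break
    return quality, data


def webp_encoder(file_path: str) -> Callable[[int], bytes]:
    """Build an encoder that saves file_path as WebP at a given quality."""
    from io import BytesIO
    from PIL import Image  # Pillow, imported lazily (assumed to be installed)

    def encode(quality: int) -> bytes:
        with Image.open(file_path) as img:
            buf = BytesIO()
            img.save(buf, format='WEBP', quality=quality)
            return buf.getvalue()

    return encode
```

A call such as `encode_within_budget(webp_encoder('a.png'), 200_000)` would return the chosen quality together with the WebP bytes, which can then be base64-encoded as before. Starting from the highest quality keeps the image as sharp as the budget allows.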

2.2 cwebp

Alternatively, other libraries can be used; under the hood, they perform the WebP conversion with cwebp (cwebp | WebP | Google for Developers).

@Lian ➜ ~\Desktop ( base 3.9.12)  C:\Users\Lian\anaconda3\Lib\site-packages\lib\libwebp_win64\bin\cwebp.exe -longhelp
Usage:
 cwebp [-preset <...>] [options] in_file [-o out_file]

If input size (-s) for an image is not specified, it is
assumed to be a PNG, JPEG, TIFF or WebP file.
Windows builds can take as input any of the files handled by WIC.

Options:
  -h / -help ............. short help
  -H / -longhelp ......... long help
  -q <float> ............. quality factor (0:small..100:big), default=75
  -alpha_q <int> ......... transparency-compression quality (0..100),
                           default=100
  -preset <string> ....... preset setting, one of:
                            default, photo, picture,
                            drawing, icon, text
     -preset must come first, as it overwrites other parameters
  -z <int> ............... activates lossless preset with given
                           level in [0:fast, ..., 9:slowest]

  -m <int> ............... compression method (0=fast, 6=slowest), default=4
  -segments <int> ........ number of segments to use (1..4), default=4
  -size <int> ............ target size (in bytes)
  -psnr <float> .......... target PSNR (in dB. typically: 42)

  -s <int> <int> ......... input size (width x height) for YUV
  -sns <int> ............. spatial noise shaping (0:off, 100:max), default=50
  -f <int> ............... filter strength (0=off..100), default=60
  -sharpness <int> ....... filter sharpness (0:most .. 7:least sharp), default=0
  -strong ................ use strong filter instead of simple (default)
  -nostrong .............. use simple filter instead of strong
  -sharp_yuv ............. use sharper (and slower) RGB->YUV conversion
  -partition_limit <int> . limit quality to fit the 512k limit on
                           the first partition (0=no degradation ... 100=full)
  -pass <int> ............ analysis pass number (1..10)
  -crop <x> <y> <w> <h> .. crop picture with the given rectangle
  -resize <w> <h> ........ resize picture (after any cropping)
  -mt .................... use multi-threading if available
  -low_memory ............ reduce memory usage (slower encoding)
  -map <int> ............. print map of extra info
  -print_psnr ............ prints averaged PSNR distortion
  -print_ssim ............ prints averaged SSIM distortion
  -print_lsim ............ prints local-similarity distortion
  -d <file.pgm> .......... dump the compressed output (PGM file)
  -alpha_method <int> .... transparency-compression method (0..1), default=1
  -alpha_filter <string> . predictive filtering for alpha plane,
                           one of: none, fast (default) or best
  -exact ................. preserve RGB values in transparent area, default=off
  -blend_alpha <hex> ..... blend colors against background color
                           expressed as RGB values written in
                           hexadecimal, e.g. 0xc0e0d0 for red=0xc0
                           green=0xe0 and blue=0xd0
  -noalpha ............... discard any transparency information
  -lossless .............. encode image losslessly, default=off
  -near_lossless <int> ... use near-lossless image
                           preprocessing (0..100=off), default=100
  -hint <string> ......... specify image characteristics hint,
                           one of: photo, picture or graph

  -metadata <string> ..... comma separated list of metadata to
                           copy from the input to the output if present.
                           Valid values: all, none (default), exif, icc, xmp

  -short ................. condense printed message
  -quiet ................. don't print anything
  -version ............... print version number and exit
  -noasm ................. disable all assembly optimizations
  -v ..................... verbose, e.g. print encoding/decoding times
  -progress .............. report encoding progress

Experimental Options:
  -jpeg_like ............. roughly match expected JPEG size
  -af .................... auto-adjust filter strength
  -pre <int> ............. pre-processing filter

C:\\Users\\Lian\\anaconda3\\lib\\site-packages/lib/libwebp_win64/bin/cwebp.exe -q 40 "D:\\auto_save_snap\\a.png" -o D:\\auto_save_snap\\python_logo.webp

Saving file 'D:\\auto_save_snap\\python_logo.webp'
File:      D:\\auto_save_snap\\a.png
Dimension: 1797 x 1037
Output:    156502 bytes Y-U-V-All-PSNR 34.63 41.84 43.16   36.04 dB
           (0.67 bpp)
block count:  intra4:       3554  (48.39%)
              intra16:      3791  (51.61%)
              skipped:      3313  (45.11%)
bytes used:  header:            301  (0.2%)
             mode-partition:  18315  (11.7%)
 Residuals bytes  |segment 1|segment 2|segment 3|segment 4|  total
    macroblocks:  |      16%|      12%|      21%|      51%|    7345
      quantizer:  |      59 |      55 |      46 |      34 |
   filter level:  |      19 |      12 |      19 |      14 |

3. The p tag

Wrap the <img> tag in one more layer:

<p><span class="md-image"><img alt="img" src="base64_url" referrerpolicy="no-referrer"></span></p>

The advantage of this approach is that, while the document is being edited, the image data is rendered as the image itself.

4. Reference-style links

![img][base64_tag]

[base64_tag]: base64_img

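This method is simple, but it is inconvenient during writing: the long base64 string stays visible in the editor, even though that part is not visible once the document is converted to HTML.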
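One way to reduce the clutter is to keep all the long definitions at the very end of the file. The sketch below rewrites inline data URI images into reference-style links and appends the definitions after the body; the function name `move_data_uris_to_references` and the generated tag names are hypothetical, and the regular expression only handles the simple `![alt](data:image/...)` form.

```python
import re

# Matches inline images whose target is a data URI, e.g. ![img](data:image/png;base64,...)
INLINE_DATA_IMAGE = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<uri>data:image/[^)\s]+)\)')


def move_data_uris_to_references(markdown: str, prefix: str = 'base64_img') -> str:
    """Rewrite inline data URI images as reference links defined at the end."""
    definitions = []

    def replace(match: re.Match) -> str:
        # Tags are numbered in order of appearance: base64_img_1, base64_img_2, ...
        tag = f'{prefix}_{len(definitions) + 1}'
        definitions.append(f'[{tag}]: {match.group("uri")}')
        return f'![{match.group("alt")}][{tag}]'

    body = INLINE_DATA_IMAGE.sub(replace, markdown)
    if not definitions:
        return markdown
    # Keep the body readable and push all long strings to the bottom
    return body.rstrip('\n') + '\n\n' + '\n'.join(definitions) + '\n'
```

The long strings are still in the file, but they sit below the prose instead of in the middle of it.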