# core


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Introduction

[The Python Standard
Library](https://docs.python.org/3/library/index.html) documentation is
very helpful for learning Python. So is
[Solveit](https://solve.it.com/)! Solveit is a Jupyter notebook plus AI with
superpowers. Learning to program with AI is both fun and productive, so
I wanted to convert the HTML pages of the Python documentation into
Solveit dialogs. A dialog is made up of small note and code messages
under appropriate headings, which can be extracted from each page's
table of contents.

How it works:

- We first get the HTML from the Python documentation web page.
- We turn it into `(msg_type, element)` tuples, where `msg_type` is `note` or
  `code` and `element` is a soup element.
- We turn each `element` into an appropriate Solveit message for the dialog.

The goal is to use `#` for the title, `##` for each subheading, and `###` for
each function definition from the docs.
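
The heading scheme can be sketched as a small lookup. The names `HEADING_PREFIX` and `mk_heading` are hypothetical and not part of dialogify; the sketch only illustrates the intended mapping.

``` python
# Hypothetical helper, not part of dialogify: maps the kind of doc item
# to the markdown heading level described above.
HEADING_PREFIX = {'title': '#', 'subheading': '##', 'definition': '###'}

def mk_heading(kind, text):
    "Return `text` as a markdown heading for the given `kind`."
    return f"{HEADING_PREFIX[kind]} {text}"

print(mk_heading('title', 'random'))
print(mk_heading('subheading', 'Bookkeeping functions'))
print(mk_heading('definition', '`random.seed(a=None, version=2)`'))
```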

First, we grab the HTML from the documentation and create `soup`.

``` python
import httpx
from bs4 import BeautifulSoup

doc_url = 'https://docs.python.org/3/library/random.html'
doc_html = httpx.get(doc_url).text
doc_html[:600]
```

    '<!DOCTYPE html>\n\n<html lang="en" data-content_root="../">\n  <head>\n    <meta charset="utf-8" />\n    <meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />\n<meta property="og:title" content="random — Generate pseudo-random numbers" />\n<meta property="og:type" content="website" />\n<meta property="og:url" content="https://docs.python.org/3/library/random.html" />\n<meta property="og:site_name" content="Python documentation" />\n<meta property="og:description" content="Source code: Lib/random.py This module imple'

``` python
soup = BeautifulSoup(doc_html, 'html.parser')
```

## Some helpful utilities

Here are some utility functions for getting the main content, cleaning
text, getting the title, and so on.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L23"
target="_blank" style="float:right; font-size:smaller">source</a>

### get_main

``` python

def get_main(
    soup
):

```

*Extract the main content section from Python docs soup*

``` python
ms = get_main(soup); str(ms)[:300]
```

    '<section id="module-random">\n<span id="random-generate-pseudo-random-numbers"></span><h1><code class="xref py py-mod docutils literal notranslate"><span class="pre">random</span></code> — Generate pseudo-random numbers<a class="headerlink" href="#module-random" title="Link to this heading">¶</a></h1'

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L28"
target="_blank" style="float:right; font-size:smaller">source</a>

### clean_txt

``` python

def clean_txt(
    el
):

```

*Clean element text by removing paragraph signs and extra whitespace*
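
`clean_txt` has no example here, so this is a minimal sketch of the idea on a plain string. The real function takes a soup element; `clean_text_sketch` is a hypothetical stand-in, and its exact whitespace handling is an assumption.

``` python
import re

def clean_text_sketch(text):
    "Drop the pilcrow used for header links and collapse runs of whitespace."
    text = text.replace('¶', '')               # remove the paragraph sign
    return re.sub(r'\s+', ' ', text).strip()   # newlines and repeated spaces become one space

print(clean_text_sketch('Bookkeeping\nfunctions¶'))
```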

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L33"
target="_blank" style="float:right; font-size:smaller">source</a>

### get_title

``` python

def get_title(
    section
):

```

*Extract the h1 title from a section*

``` python
get_title(ms)
```

    'random — Generate pseudo-random numbers'

Before turning the `soup` into markdown, we split the main content into
sections, represented as `(title, section)` tuples.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L38"
target="_blank" style="float:right; font-size:smaller">source</a>

### get_sections

``` python

def get_sections(
    main
):

```

*Get all direct child sections as (title, section_element) tuples*

``` python
len(get_sections(ms))
```

    12

We can grab the sections and pick out the bookkeeping section:

``` python
sts = get_sections(ms)
bk = sts[0][1]
str(bk)[:300]
```

    '<section id="bookkeeping-functions">\n<h2>Bookkeeping functions<a class="headerlink" href="#bookkeeping-functions" title="Link to this heading">¶</a></h2>\n<dl class="py function">\n<dt class="sig sig-object py" id="random.seed">\n<span class="sig-prename descclassname"><span class="pre">random.</span><'

Let's look at a preview to check that it looks good.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L43"
target="_blank" style="float:right; font-size:smaller">source</a>

### preview_msgs

``` python

def preview_msgs(
    msgs
):

```

*Preview message tuples as rendered markdown*

``` python
preview_msgs(get_sections(ms)[:2])
```

**\[Bookkeeping functions\]**

<section id="bookkeeping-functions">

<h2>

Bookkeeping
functions<a class="headerlink" href="#bookkeeping-functions" title="Link to this heading">¶</a>
</h2>

<dl class="py function">

<dt class="sig sig-object py" id="random.seed">

<span class="sig-prename descclassname"><span class="pre">random.</span></span><span class="sig-name descname"><span class="pre">seed</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">a</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>,
<em class="sig-param"><span class="n"><span class="pre">version</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">2</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#random.seed" title="Link to this definition">¶</a>
</dt>

<dd>

<p>

Initialize the random number generator.
</p>

<p>

If <em>a</em> is omitted or
<code class="docutils literal notranslate"><span class="pre">None</span></code>,
the current system time is used. If randomness sources are provided by
the operating system, they are used instead of the system time (see the
<a class="reference internal" href="os.html#os.urandom" title="os.urandom"><code class="xref py py-func docutils literal notranslate"><span class="pre">os.urandom()</span></code></a>
function for details on availability).
</p>

<p>

If <em>a</em> is an int, its absolute value is used directly.
</p>

<p>

With version 2 (the default), a
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>,
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>,
or
<a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a>
object gets converted to an
<a class="reference internal" href="functions.html#int" title="int"><code class="xref py py-class docutils literal notranslate"><span class="pre">int</span></code></a>
and all of its bits are used.
</p>

<p>

With version 1 (provided for reproducing random sequences from older
versions of Python), the algorithm for
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
and
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
generates a narrower range of seeds.
</p>

<div class="versionchanged">

<p>

<span class="versionmodified changed">Changed in version 3.2:
</span>Moved to the version 2 scheme which uses all of the bits in a
string seed.
</p>

</div>

<div class="versionchanged">

<p>

<span class="versionmodified changed">Changed in version 3.11:
</span>The <em>seed</em> must be one of the following types:
<code class="docutils literal notranslate"><span class="pre">None</span></code>,
<a class="reference internal" href="functions.html#int" title="int"><code class="xref py py-class docutils literal notranslate"><span class="pre">int</span></code></a>,
<a class="reference internal" href="functions.html#float" title="float"><code class="xref py py-class docutils literal notranslate"><span class="pre">float</span></code></a>,
<a class="reference internal" href="stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>,
<a class="reference internal" href="stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>,
or
<a class="reference internal" href="stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a>.
</p>

</div>

</dd>

</dl>

<dl class="py function">

<dt class="sig sig-object py" id="random.getstate">

<span class="sig-prename descclassname"><span class="pre">random.</span></span><span class="sig-name descname"><span class="pre">getstate</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#random.getstate" title="Link to this definition">¶</a>
</dt>

<dd>

<p>

Return an object capturing the current internal state of the generator.
This object can be passed to
<a class="reference internal" href="#random.setstate" title="random.setstate"><code class="xref py py-func docutils literal notranslate"><span class="pre">setstate()</span></code></a>
to restore the state.
</p>

</dd>

</dl>

<dl class="py function">

<dt class="sig sig-object py" id="random.setstate">

<span class="sig-prename descclassname"><span class="pre">random.</span></span><span class="sig-name descname"><span class="pre">setstate</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">state</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#random.setstate" title="Link to this definition">¶</a>
</dt>

<dd>

<p>

<em>state</em> should have been obtained from a previous call to
<a class="reference internal" href="#random.getstate" title="random.getstate"><code class="xref py py-func docutils literal notranslate"><span class="pre">getstate()</span></code></a>,
and
<a class="reference internal" href="#random.setstate" title="random.setstate"><code class="xref py py-func docutils literal notranslate"><span class="pre">setstate()</span></code></a>
restores the internal state of the generator to what it was at the time
<a class="reference internal" href="#random.getstate" title="random.getstate"><code class="xref py py-func docutils literal notranslate"><span class="pre">getstate()</span></code></a>
was called.
</p>

</dd>

</dl>

</section>

**\[Functions for bytes\]**

<section id="functions-for-bytes">

<h2>

Functions for
bytes<a class="headerlink" href="#functions-for-bytes" title="Link to this heading">¶</a>
</h2>

<dl class="py function">

<dt class="sig sig-object py" id="random.randbytes">

<span class="sig-prename descclassname"><span class="pre">random.</span></span><span class="sig-name descname"><span class="pre">randbytes</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#random.randbytes" title="Link to this definition">¶</a>
</dt>

<dd>

<p>

Generate <em>n</em> random bytes.
</p>

<p>

This method should not be used for generating security tokens. Use
<a class="reference internal" href="secrets.html#secrets.token_bytes" title="secrets.token_bytes"><code class="xref py py-func docutils literal notranslate"><span class="pre">secrets.token_bytes()</span></code></a>
instead.
</p>

<div class="versionadded">

<p>

<span class="versionmodified added">Added in version 3.9.</span>
</p>

</div>

</dd>

</dl>

</section>

[`html_to_md`](https://galopyz.github.io/dialogify/core.html#html_to_md)
converts HTML into markdown for the appropriate tags.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L49"
target="_blank" style="float:right; font-size:smaller">source</a>

### html_to_md

``` python

def html_to_md(
    el, in_link:bool=False
):

```

*Recursively convert HTML element to markdown string*
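
To illustrate the kind of tag-to-markdown mapping involved, here is a rough sketch that uses the standard library's `html.parser` instead of BeautifulSoup. `MiniMd` and `mini_md` are hypothetical names, only `<em>`, `<code>`, and `<a>` are handled, and skipping backticks inside links is an assumption about what the `in_link` flag is for.

``` python
from html.parser import HTMLParser

class MiniMd(HTMLParser):
    "Hypothetical stand-in for html_to_md: handles only <em>, <code>, and <a>."
    def __init__(self):
        super().__init__()
        self.out, self.hrefs = [], []   # output pieces and a stack of open link targets

    def handle_starttag(self, tag, attrs):
        if tag == 'em': self.out.append('*')
        elif tag == 'code' and not self.hrefs: self.out.append('`')  # no backticks inside links
        elif tag == 'a':
            self.hrefs.append(dict(attrs).get('href', ''))
            self.out.append('[')

    def handle_endtag(self, tag):
        if tag == 'em': self.out.append('*')
        elif tag == 'code' and not self.hrefs: self.out.append('`')
        elif tag == 'a': self.out.append(f']({self.hrefs.pop()})')

    def handle_data(self, data):
        self.out.append(data)

def mini_md(html):
    "Convert a small HTML fragment to markdown using MiniMd."
    p = MiniMd()
    p.feed(html)
    p.close()   # flush any buffered trailing text
    return ''.join(p.out)

print(mini_md('If <em>a</em> is <code>None</code>, see '
              '<a href="os.html#os.urandom"><code>os.urandom()</code></a>.'))
```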

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L64"
target="_blank" style="float:right; font-size:smaller">source</a>

### html_to_md_children

``` python

def html_to_md_children(
    el, in_link:bool=False
):

```

*Convert all children of an HTML element to markdown*

``` python
print(html_to_md(bk))
```


    Bookkeeping functions[¶](#bookkeeping-functions)


    random.seed(*a=None*, *version=2*)[¶](#random.seed)
    Initialize the random number generator.
    If *a* is omitted or `None`, the current system time is used.  If
    randomness sources are provided by the operating system, they are used
    instead of the system time (see the [os.urandom()](os.html#os.urandom) function for details
    on availability).
    If *a* is an int, its absolute value is used directly.
    With version 2 (the default), a [str](stdtypes.html#str), [bytes](stdtypes.html#bytes), or [bytearray](stdtypes.html#bytearray)
    object gets converted to an [int](functions.html#int) and all of its bits are used.
    With version 1 (provided for reproducing random sequences from older versions
    of Python), the algorithm for [str](stdtypes.html#str) and [bytes](stdtypes.html#bytes) generates a
    narrower range of seeds.

    Changed in version 3.2: Moved to the version 2 scheme which uses all of the bits in a string seed.


    Changed in version 3.11: The *seed* must be one of the following types:
    `None`, [int](functions.html#int), [float](functions.html#float), [str](stdtypes.html#str),
    [bytes](stdtypes.html#bytes), or [bytearray](stdtypes.html#bytearray).




    random.getstate()[¶](#random.getstate)
    Return an object capturing the current internal state of the generator.  This
    object can be passed to [setstate()](#random.setstate) to restore the state.



    random.setstate(*state*)[¶](#random.setstate)
    *state* should have been obtained from a previous call to [getstate()](#random.getstate), and
    [setstate()](#random.setstate) restores the internal state of the generator to what it was at
    the time [getstate()](#random.getstate) was called.

## `soup` to `(msg_type, el)`

Solveit messages have four types: `Code`, `Note`, `Prompt`, and `Raw`.
For creating dialogs, we only need `note` and `code`.
Once the `soup` is turned into `(msg_type, el)` tuples, we can easily
turn those into Solveit messages with markdown.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L69"
target="_blank" style="float:right; font-size:smaller">source</a>

### has_cls

``` python

def has_cls(
    el, cls
):

```

*Check if element has a specific CSS class*

`dt` is special because it is used for function definitions in the Python
docs.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L74"
target="_blank" style="float:right; font-size:smaller">source</a>

### get_msg_type

``` python

def get_msg_type(
    el
):

```

*Determine message type (‘note’, ‘code’, or ‘dt’) for an HTML element*
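
As a rough illustration of this kind of classification, the sketch below decides a message type from a tag name and its CSS classes. `msg_type_sketch` is a hypothetical function, and the specific rules (a `dt` with the `sig` class is a signature, `pre` is code) are assumptions that may differ from what `get_msg_type` actually checks.

``` python
def msg_type_sketch(tag, classes=()):
    "Classify an element by tag name and CSS classes (the rules here are assumptions)."
    if tag == 'dt' and 'sig' in classes: return 'dt'  # signature line of a definition
    if tag == 'pre': return 'code'                    # preformatted code sample
    return 'note'                                     # everything else is treated as prose

print(msg_type_sketch('dt', ['sig', 'sig-object', 'py']))
print(msg_type_sketch('pre'))
print(msg_type_sketch('p'))
```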

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L87"
target="_blank" style="float:right; font-size:smaller">source</a>

### collect_msgs

``` python

def collect_msgs(
    el
):

```

*Recursively collect (msg_type, element) tuples from HTML tree*

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L98"
target="_blank" style="float:right; font-size:smaller">source</a>

### format_msg

``` python

def format_msg(
    msg_type, el
):

```

*Convert (msg_type, element) tuple to (msg_type, markdown_string)*

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L113"
target="_blank" style="float:right; font-size:smaller">source</a>

### table_to_md

``` python

def table_to_md(
    table
):

```

*Convert HTML table to markdown format*
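
The target format can be sketched without any HTML parsing. The hypothetical `rows_to_md` below takes cell text that has already been extracted and renders a markdown pipe table; how `table_to_md` pulls the cells out of the soup is not shown here.

``` python
def rows_to_md(header, rows):
    "Render a header and data rows (lists of strings) as a markdown pipe table."
    lines = ['| ' + ' | '.join(header) + ' |',
             '| ' + ' | '.join('---' for _ in header) + ' |']  # separator row
    lines += ['| ' + ' | '.join(r) + ' |' for r in rows]
    return '\n'.join(lines)

print(rows_to_md(['Function', 'Purpose'],
                 [['seed', 'Initialize the generator'],
                  ['getstate', 'Capture the internal state']]))
```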

Some functions and classes in the docs have multiple signatures. In this
case, the `dt`s need to be merged into a single message that serves as a
heading.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L125"
target="_blank" style="float:right; font-size:smaller">source</a>

### merge_dt

``` python

def merge_dt(
    msgs
):

```

*Merge consecutive ‘dt’ messages into single heading notes*
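
The merging step can be sketched with `itertools.groupby`. `merge_dt_sketch` is a hypothetical version written for illustration; it assumes each message is a `(msg_type, markdown_string)` tuple and that merged signatures are joined with newlines.

``` python
from itertools import groupby

def merge_dt_sketch(msgs):
    "Join each run of consecutive 'dt' messages into one 'note' message."
    merged = []
    for kind, group in groupby(msgs, key=lambda m: m[0]):
        texts = [text for _, text in group]
        if kind == 'dt': merged.append(('note', '\n'.join(texts)))  # one heading note per run
        else: merged.extend((kind, t) for t in texts)               # other messages pass through
    return merged

msgs = [('dt', '### `f(a)`'), ('dt', '### `f(a, b)`'), ('note', 'Does something.')]
print(merge_dt_sketch(msgs))
```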

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L134"
target="_blank" style="float:right; font-size:smaller">source</a>

### format_msgs

``` python

def format_msgs(
    el
):

```

*Convert HTML element to list of formatted (msg_type, markdown) tuples*

Let’s try it on the `bytearray` entry from
<https://docs.python.org/3.12/library/functions.html>.

``` python
bytearray_html = '''<dl class="py class" id="func-bytearray">
<dt class="sig sig-object py">
<em class="property"><span class="k"><span class="pre">class</span></span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">b''</span></span></em><span class="sig-paren">)</span></dt>
<dt class="sig sig-object py">
<em class="property"><span class="k"><span class="pre">class</span></span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">encoding</span></span></em><span class="sig-paren">)</span></dt>
<dt class="sig sig-object py">
<em class="property"><span class="k"><span class="pre">class</span></span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">encoding</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">errors</span></span></em><span class="sig-paren">)</span></dt>
<dd><p>Return a new array of bytes.</p>
<p>The optional <em>source</em> parameter can be used to initialize the array:</p>
<ul class="simple">
<li><p>If it is a <em>string</em>, you must also give the <em>encoding</em>.</p></li>
<li><p>If it is an <em>integer</em>, the array will have that size.</p></li>
</ul>
<p>Without an argument, an array of size 0 is created.</p>
</dd></dl>'''
```

``` python
ba_soup = BeautifulSoup(bytearray_html, 'html.parser')
preview_msgs(collect_msgs(ba_soup.dl))
```

**\[dt\]**

<dt class="sig sig-object py">

<em class="property"><span class="k"><span class="pre">class</span></span><span class="w">
</span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">b''</span></span></em><span class="sig-paren">)</span>
</dt>

**\[dt\]**

<dt class="sig sig-object py">

<em class="property"><span class="k"><span class="pre">class</span></span><span class="w">
</span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span></em>,
<em class="sig-param"><span class="n"><span class="pre">encoding</span></span></em><span class="sig-paren">)</span>
</dt>

**\[dt\]**

<dt class="sig sig-object py">

<em class="property"><span class="k"><span class="pre">class</span></span><span class="w">
</span></em><span class="sig-name descname"><span class="pre">bytearray</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">source</span></span></em>,
<em class="sig-param"><span class="n"><span class="pre">encoding</span></span></em>,
<em class="sig-param"><span class="n"><span class="pre">errors</span></span></em><span class="sig-paren">)</span>
</dt>

**\[note\]**

<p>

Return a new array of bytes.
</p>

**\[note\]**

<p>

The optional <em>source</em> parameter can be used to initialize the
array:
</p>

**\[note\]**

<ul class="simple">

<li>

<p>

If it is a <em>string</em>, you must also give the <em>encoding</em>.
</p>

</li>

<li>

<p>

If it is an <em>integer</em>, the array will have that size.
</p>

</li>

</ul>

**\[note\]**

<p>

Without an argument, an array of size 0 is created.
</p>

``` python
ba_msgs = format_msgs(ba_soup)
ba_msgs
```

    [('note',
      "### `class bytearray(source=b'')`\n### `class bytearray(source, encoding)`\n### `class bytearray(source, encoding, errors)`"),
     ('note', 'Return a new array of bytes.'),
     ('note',
      'The optional *source* parameter can be used to initialize the array:'),
     ('note',
      '\n- If it is a *string*, you must also give the *encoding*.\n- If it is an *integer*, the array will have that size.\n'),
     ('note', 'Without an argument, an array of size 0 is created.')]

``` python
merge_dt(ba_msgs)
```

    [('note',
      "### `class bytearray(source=b'')`\n### `class bytearray(source, encoding)`\n### `class bytearray(source, encoding, errors)`"),
     ('note', 'Return a new array of bytes.'),
     ('note',
      'The optional *source* parameter can be used to initialize the array:'),
     ('note',
      '\n- If it is a *string*, you must also give the *encoding*.\n- If it is an *integer*, the array will have that size.\n'),
     ('note', 'Without an argument, an array of size 0 is created.')]

``` python
preview_msgs(format_msgs(ba_soup))
```

**\[note\]**

### `class bytearray(source=b'')`

### `class bytearray(source, encoding)`

### `class bytearray(source, encoding, errors)`

**\[note\]**

Return a new array of bytes.

**\[note\]**

The optional *source* parameter can be used to initialize the array:

**\[note\]**

- If it is a *string*, you must also give the *encoding*.
- If it is an *integer*, the array will have that size.

**\[note\]**

Without an argument, an array of size 0 is created.

Looks good! We can use `create_msgs` to create Solveit messages.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L139"
target="_blank" style="float:right; font-size:smaller">source</a>

### create_msgs

``` python

def create_msgs(
    doc_tuples, dname:str='', **kwargs
):

```

*Create solveit messages from list of (msg_type, content) tuples*

``` python
# create_msgs(format_msgs(ms))
```

And we can make dialogs.

------------------------------------------------------------------------

<a
href="https://github.com/galopyz/dialogify/blob/main/dialogify/core.py#L144"
target="_blank" style="float:right; font-size:smaller">source</a>

### mk_dialog

``` python

def mk_dialog(
    url, dname:str=''
):

```

*Fetch Python docs URL and create a solveit dialog from it*

Here are some examples of creating Solveit dialogs:

``` python
# mk_dialog('https://docs.python.org/3.12/library/functions.html', dname='dialogify/testing')
```

``` python
# mk_dialog('https://docs.python.org/3.12/howto/regex.html#regex-howto', dname='dialogify/regex_howto')
```

``` python
# mk_dialog('https://docs.python.org/3.12/howto/regex.html#regex-howto')
```
