Previously, when using this theme, I noticed that inserting a code block at the very beginning of an article broke the styling of the entire homepage. At the time I didn't dig into it, partly because I felt that opening an article with a code block wasn't very elegant anyway.
Now that I have some free time, I'm going to fix this bug in the theme. Someone else has also reported the issue, so it's a good opportunity to study it.
Why does this issue occur?
Create a new index.md document and put a Markdown table at the very beginning of the article. The rendered homepage will have styling issues. Open the browser's developer tools and inspect the source of the generated index.html page. You can see that the HTML tags belonging to the table are not properly closed.
Here is the relevant snippet from the theme's template:
else if post.content
  - var br = 0
  - for (var i = 0; i < 5; ++i) {
    - br = post.content.indexOf('\n',br+1)
    if br<0
      - break
    if br >150
      - break
  - }
  if br < 0
    .post-content
      != post.content
  else
    .post-content
      != post.content.substring(0, br)
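The code shows that the rendered content is cut at a '\n' breakpoint: the loop advances through at most five newlines, stops early once the offset exceeds 150 characters, and keeps only the text before that position. Because the content is already HTML at this point, the cut happens to land in the middle of the table markup.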
The HTML content after truncation is shown below:
<div class="post-content">
  <table>
    <thead>
      <tr>
        <th>-</th>
        <th>-</th>
</div>
<p class="readmore"><a href="/2020/01/21/hexo/hexo/">Read More</a></p>
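The truncation can be reproduced outside the theme. The following is a minimal sketch in plain JavaScript that mirrors the loop from the template; the function name truncateAtNewline and the sample table string are assumptions made for illustration and are not part of the theme.

```javascript
// Mirrors the template's loop: advance through at most `maxLines` newlines,
// stop early once the offset exceeds `maxOffset`, then cut the string there.
function truncateAtNewline(content, maxLines = 5, maxOffset = 150) {
  let br = 0;
  for (let i = 0; i < maxLines; ++i) {
    br = content.indexOf('\n', br + 1);
    if (br < 0) break;         // ran out of newlines: keep the whole content
    if (br > maxOffset) break; // already far enough into the content
  }
  return br < 0 ? content : content.substring(0, br);
}

// Hypothetical rendered output for a small Markdown table.
const html = [
  '<table>',
  '<thead>',
  '<tr>',
  '<th>-</th>',
  '<th>-</th>',
  '</tr>',
  '</thead>',
  '</table>',
].join('\n');

const excerpt = truncateAtNewline(html);
console.log(excerpt); // ends right after the second <th>, with <tr>, <thead> and <table> still open
```

The cut falls after the fifth newline, so the three outer tags are never closed and the browser ends up nesting the rest of the homepage inside the table.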
Solution Approach
I came up with several possible solutions to this issue.
Display Text Only
Convert the blog post to plain text, ignoring code blocks and images, and then take the first part of that text as the summary to display.
Close Unclosed Tags
Use an algorithm to close the HTML tags left open by the truncation. I implemented this while fixing the bug, but it turned out to be less useful than I had imagined. For example, a table may get closed before any row content is displayed. A table with five rows may also show only one row in the summary, and only after clicking through does the reader find that there are several more. The advantage is that images and code blocks can still appear in the summary.
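One way to close the dangling tags is to track open elements on a stack and append the missing closing tags in reverse order. This is only a sketch of that idea, not the code written for the theme: the function name closeOpenTags, the regular expression, and the list of void elements are all assumptions, and the pattern will misbehave on attribute values that contain ">".

```javascript
// Void elements never take a closing tag, so they must not go on the stack.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Appends closing tags for every element still open at the end of `fragment`.
function closeOpenTags(fragment) {
  const stack = [];
  // Groups: 1 = "/" for a closing tag, 2 = tag name, 3 = "/" for <tag />.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(\/?)>/g;
  let match;
  while ((match = tagPattern.exec(fragment)) !== null) {
    const [, closing, rawName, selfClosing] = match;
    const name = rawName.toLowerCase();
    if (VOID_TAGS.has(name) || selfClosing) continue;
    if (closing) {
      // Drop the matching open tag and anything opened after it.
      const idx = stack.lastIndexOf(name);
      if (idx !== -1) stack.length = idx;
    } else {
      stack.push(name);
    }
  }
  // Close the innermost element first.
  return fragment + stack.reverse().map((name) => `</${name}>`).join('');
}

const truncated = '<table>\n<thead>\n<tr>\n<th>-</th>\n<th>-</th>';
console.log(closeOpenTags(truncated));
```

The result is well-formed, but as noted above the summary still shows only whatever part of the table survived the cut.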
Continue Matching if Not Closed
This approach keeps extending the excerpt whenever a tag is still open at the cut point, until every open tag has been closed. I also implemented this while fixing the bug, but ran into more problems. For example, tags such as img and br are void elements that never have a closing tag, so they need special handling, and in the worst case the matching continues all the way to the end of the article before the tag closes. In addition, matching HTML tags with regular expressions is quite tricky, and I haven't found a correct pattern yet. It may require further study, though some people say that regular expressions cannot parse HTML at all.
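The idea can be sketched as moving the cut point forward one newline at a time until no element is left open. The names openDepth and extendUntilBalanced are hypothetical, the tag pattern is the same simplified assumption as in any regex-based approach, and the depth counter does not cope with stray closing tags.

```javascript
// Void elements have no closing tag and are ignored when counting depth.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Counts how many non-void elements are still open at the end of `fragment`.
function openDepth(fragment) {
  let depth = 0;
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(\/?)>/g;
  for (const [, closing, name, selfClosing] of fragment.matchAll(tagPattern)) {
    if (VOID_TAGS.has(name.toLowerCase()) || selfClosing) continue;
    depth += closing ? -1 : 1;
  }
  return depth;
}

// Starting from the cut offset `start`, move to the next newline until
// nothing is left open, or return everything if that never happens.
function extendUntilBalanced(content, start) {
  let cut = start;
  while (cut >= 0 && cut < content.length && openDepth(content.substring(0, cut)) > 0) {
    cut = content.indexOf('\n', cut + 1);
  }
  return cut < 0 ? content : content.substring(0, cut);
}

const table = '<table>\n<thead>\n<tr>\n<th>-</th>\n<th>-</th>\n</tr>\n</thead>\n</table>';
const article = table + '\n<p>after</p>';
console.log(extendUntilBalanced(article, 42)); // offset 42 is the fifth newline
```

With this input the excerpt grows until the whole table is included. An element that is never closed makes the function return the entire article, which is the drawback described above.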
Final Solution
Later, after looking at how other themes handle this, I concluded that displaying only text is the better approach: use the strip_html helper function provided by Hexo to remove the tags before truncating the content.
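To illustrate the text-only approach without a running Hexo site, here is a small sketch in plain JavaScript. The stripTags function is a simplified stand-in for what strip_html does and is not Hexo's implementation; textSummary and the 150-character limit are likewise assumptions for illustration. In the theme template itself, the equivalent would presumably be a call to strip_html(post.content) followed by a substring.

```javascript
// Simplified stand-in for Hexo's strip_html helper (NOT Hexo's implementation):
// removes anything that looks like a tag.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Hypothetical helper: plain-text summary of at most `maxLength` characters.
function textSummary(html, maxLength = 150) {
  // Collapse the whitespace left behind by removed block-level tags.
  const text = stripTags(html).replace(/\s+/g, ' ').trim();
  return text.length > maxLength ? text.substring(0, maxLength) + '...' : text;
}

const rendered =
  '<table>\n<thead>\n<tr>\n<th>-</th>\n<th>-</th>\n</tr>\n</thead>\n</table>\n' +
  '<p>Hello <strong>Hexo</strong>.</p>';

console.log(textSummary(rendered)); // "- - Hello Hexo."
```

Because the tags are gone before the cut is made, the summary can be truncated at any character without leaving unclosed markup on the homepage.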