This post looks at how to convert text to an image in Microsoft Word.

Question

I have a large book written in Microsoft Word and want to create a macro that finds all text with a predefined style and converts that text to an inline image. The text is in Arabic and is generally no longer than 4-5 lines. Is this possible?

UPDATE: Here's an example to show what I'm referring to:

I want to replace that entire line in Arabic with an image (as if I had cropped the attached image to include only the Arabic and then replaced the Arabic line with that image).

The reason I want a macro or script to do this is that there are hundreds of such lines; updating them one by one is cumbersome and would make later modifications difficult.

UPDATE2: I found an interesting option here: http://windowssecrets.com/forums/showthread.php/31344-Convert-Text-to-an-Image-of-Text-in-VBA-(Office-2000-Sr1a)

It looks like you can cut a piece of text and then "Paste Special" it as an image. So if there is a way to automate that, it might work.

Recommended answer

This is not an answer, although I hope it will grow into a community answer. At the moment it is an exploration of what is required to solve the problem.

I know from the discussion when this question was posted on Super User that Abdullah wishes to publish his book on Kindle. So the question is really about how to get a document in English and Arabic ready for publication as an e-book.

The Kindle does not support Arabic. The number of languages it does support is slowly increasing, but I can find no evidence that Amazon plans to add Arabic in the foreseeable future.

The format behind an Amazon e-book is a cut-down version of HTML. If a Word document containing Arabic letters is exported to HTML, the Arabic letters are included as numeric character entities; for example: "&#64336; &#64337; &#64338; &#64339;". Importing either the original Word document or the HTML version to Kindle results in the leading bits being discarded, so these characters are displayed as P, Q, R and S instead of "ﭐ ﭑ ﭒ ﭓ" (Alef Wasla isolated form, Alef Wasla final form, Beeh isolated form and Beeh final form).
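The P, Q, R, S symptom can be reproduced with a little arithmetic. The following is a minimal sketch, assuming that "the leading bits being discarded" means only the low 8 bits of each code point survive; that is an assumption consistent with the observed output, not a documented description of the Kindle importer. The helper name `drop_leading_bits` is hypothetical.

```python
# Code points of the four entities quoted above (U+FB50 to U+FB53).
code_points = [64336, 64337, 64338, 64339]


def drop_leading_bits(code_point: int) -> str:
    """Keep only the low 8 bits of a code point (assumed importer behaviour)."""
    return chr(code_point & 0xFF)


for cp in code_points:
    # Show the entity, its hexadecimal code point and the truncated character.
    print(f"&#{cp}; = U+{cp:04X} -> {drop_leading_bits(cp)}")

# Joined together, the truncated characters spell "PQRS".
truncated = "".join(drop_leading_bits(cp) for cp in code_points)
```

The low byte of U+FB50 is 0x50, which is the ASCII code for P, and the next three code points follow in sequence.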

I have tried Abdullah's idea of saving some Arabic letters in a PNG file and creating an HTML file containing <p> … </p> <img src="Arabic.png"> <p> … </p>. The appearance of this file on my Kindle 2 is perfectly acceptable, so this has the potential to be a solution. The question is: how can the necessary conversions be performed?

We need to extract each Arabic string from either the Word document or its HTML equivalent and import it into a program that can convert it to a PNG file.

The only way I know of automating this would be to copy each string to a slide within PowerPoint. With PowerPoint's SaveAs option it is possible to save each slide as a separate PNG file. The slides are named SLIDE1.PNG, SLIDE2.PNG, SLIDE3.PNG and so on in sequence, which would allow a macro to relate the results to the original strings. It would then be possible to replace the Arabic strings in the HTML file with image elements. None of this would be too difficult to automate, but there is a problem: every slide image is the size of the PowerPoint page. The page could be made smallish, but what we need is for each slide to be cropped to just bigger than that slide's text. I cannot think of any way of automating this cropping.

Does anyone have a better approach than converting each Arabic phrase to a PNG file?

I have been looking for PNG editors with some sort of command-line interface but can find nothing that would be easier than using PowerPoint. Does anyone know of an alternative to PowerPoint?

Does anyone have any suggestions for automating the cropping of each image? When a string is placed in a PowerPoint slide, it is possible to set its width to, say, 6.5cm (which looks good on my Kindle) and let PowerPoint determine the height. This height could be saved for later use if anyone knows how to use it.
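One possible direction for the cropping question, offered only as a sketch and not something the original answer tested: once a slide image has been decoded into rows of pixel values, the crop rectangle is simply the bounding box of every non-background pixel plus a small margin. The code below works on a plain list of grayscale rows so that it stays self-contained; reading and writing real PNG files would need an image library and is not shown. The names `text_bounding_box` and `crop`, the white background value 255 and the one-pixel margin are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def text_bounding_box(pixels: List[List[int]], background: int = 255,
                      margin: int = 1) -> Optional[Box]:
    """Return the box around all non-background pixels, or None for a blank image."""
    height, width = len(pixels), len(pixels[0])
    # Rows and columns that contain at least one non-background pixel.
    rows = [y for y, row in enumerate(pixels) if any(p != background for p in row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(row[x] != background for row in pixels)]
    # Grow the box by the margin, clamped to the image edges.
    left = max(cols[0] - margin, 0)
    top = max(rows[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    bottom = min(rows[-1] + 1 + margin, height)
    return left, top, right, bottom


def crop(pixels: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the given box out of the pixel grid."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# A white 8x6 "slide" with two dark pixels standing in for text.
slide = [[255] * 8 for _ in range(6)]
slide[2][3] = 0
slide[3][4] = 0
box = text_bounding_box(slide)
cropped = crop(slide, box)
```

If this works on real slide images, the margin could be tuned so that each picture ends up just bigger than its text, which is the result asked for above.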

Implementing the solution

Pending any suggestions for improving the approach described above, the following outlines how I would implement it.

I would not attempt to process the Word document. I would save it as a "Web Page, Filtered" HTML file, which is a required step on the way to creating a Kindle e-book, and process that.

Within the HTML file created from my test document, the Arabic phrase comes out as:

<p class="MsoNormal"></p>
<p class="MsoNormal" align="center" style="text-align:center"><span dir="RTL"
style="font-size:24.0pt;font-family:Arial">
&#64336;&#64337;&#64338;&#64339;&#64340;&#64341;
&#64342;&#64343;&#65153;&#65154;&#65276;&#65275;
&#65274;&#65273;&#65246;&#65226;&#65227;&#65228;
</span><span style="font-size:24.0pt"></span></p>
<p class="MsoNormal"></p>
<p class="MsoNormal"></p>

I assume Abdullah's document will result in something similar. Note 1: the above is a random collection of Arabic letters. Note 2: they are stored left-to-right in reading order, even though they are read right-to-left when displayed or printed.

The whole of this block will have to be replaced with something like:

<br><img src="xxxx.png"><br>

where the file xxxx.png holds an image of the Arabic text.

The file names, such as xxxx.png, could be systematic (A001.png, A002.png, ...), but I think it would be more convenient to transliterate the first ten or twenty characters of the phrase from the Arabic alphabet to the English alphabet and use the result, with a numeric suffix, as the file name.

I would hold the records necessary to manage the process in an Excel worksheet. I would place the VBA code in the same workbook.

The steps in the conversion process that I envisage are:

  1. VBA macro to extract Arabic strings from the latest HTML file and add new strings to the Excel worksheet. (More about the Excel worksheet later.)
  2. VBA macro to create a PowerPoint file, with one slide per new string, and use SaveAs in PNG format to create one PNG file per slide before discarding the PowerPoint file.
  3. Human to crop each PNG file. (There appears to be no way of automating the cropping, so this task will be minimised by use of the data in the Excel worksheet.)
  4. VBA macro to rename each slide from SLIDEnnn.PNG to its permanent name and to record the permanent name in the Excel worksheet.
  5. VBA macro to update the latest HTML file by replacing each block containing an Arabic phrase with the appropriate HTML IMG element.

The Excel worksheet needs two columns: Arabic phrase and PNG file name. If there is any risk of the worksheet being sorted between steps 2 and 4, we may need a sequence number as well.

Macro 1 will extract each Arabic phrase from the HTML file, look down the list in the worksheet for this phrase and add the phrase at the bottom if it is not already present.
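The answer plans Macro 1 in VBA; purely as an illustration, here is the same logic as a minimal Python sketch. It assumes every Arabic phrase sits inside a <span dir="RTL" ...> element, as in the sample HTML from the test document, and it models the worksheet as a list of rows with "phrase" and "png" keys. The function names `extract_phrases` and `add_new_phrases` and the row layout are hypothetical.

```python
import html
import re
from typing import Dict, List

# Assumption: each Arabic phrase is wrapped in <span dir="RTL" ...> ... </span>.
RTL_SPAN = re.compile(r'<span\s+dir="?RTL"?[^>]*>(.*?)</span>',
                      re.IGNORECASE | re.DOTALL)


def extract_phrases(html_text: str) -> List[str]:
    """Return the decoded text of every RTL span, with whitespace normalised."""
    phrases = []
    for match in RTL_SPAN.finditer(html_text):
        # Turn entities such as &#64336; into characters, then collapse
        # the line breaks Word inserts into single spaces.
        text = " ".join(html.unescape(match.group(1)).split())
        if text:
            phrases.append(text)
    return phrases


def add_new_phrases(worksheet: List[Dict[str, str]], phrases: List[str]) -> int:
    """Append phrases not yet listed; return how many rows were added."""
    known = {row["phrase"] for row in worksheet}
    added = 0
    for phrase in phrases:
        if phrase not in known:
            worksheet.append({"phrase": phrase, "png": ""})
            known.add(phrase)
            added += 1
    return added


sample = '''<p class="MsoNormal" align="center"><span dir="RTL"
style="font-size:24.0pt;font-family:Arial">
&#64336;&#64337;
&#64338;&#64339;
</span><span style="font-size:24.0pt"></span></p>'''

worksheet: List[Dict[str, str]] = []
added_first = add_new_phrases(worksheet, extract_phrases(sample))
added_again = add_new_phrases(worksheet, extract_phrases(sample))  # nothing new
```

Running the extraction twice over the same file adds nothing the second time, which is the property that lets a phrase enter the process only once.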

Macro 2 will look for phrases in the worksheet that do not have a PNG file name. These new phrases are the ones to be written to the PowerPoint presentation. That is, a phrase only goes into this process once.

Task 3, cropping each PNG file, will be a pain. All I can say is that it will only be needed once per phrase.

Macro 4 will assume that SLIDE001.PNG, SLIDE002.PNG, … are in the same sequence as the phrases without PNG files in the worksheet. If this might not be true (because the worksheet has been sorted), we will need either a sequence number or to retain the PowerPoint file. The macro will assign a unique name to each new phrase, record this name in the worksheet and rename the PNG file.
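As a rough illustration of Macro 4 (again in Python, not the VBA the answer has in mind), the sketch below walks the worksheet rows that have no PNG name yet, pairs them in order with the exported slide files, and gives each one the next unused systematic name. The slide file pattern is passed in as a parameter because the exact names PowerPoint produces are not confirmed here; the `assign_png_names` function, the "A" prefix and the row layout are assumptions.

```python
import os
import tempfile
from typing import Dict, List


def assign_png_names(worksheet: List[Dict[str, str]], folder: str,
                     prefix: str = "A",
                     slide_pattern: str = "SLIDE{n}.PNG") -> None:
    """Rename slide files to permanent names and record them in the worksheet."""
    used = {row["png"] for row in worksheet if row["png"]}
    number = 0  # counter for systematic names such as A001.png
    slide = 0   # position of the row among the rows still lacking a PNG name
    for row in worksheet:
        if row["png"]:
            continue  # handled in an earlier run
        slide += 1
        # Pick the next systematic name that is not already taken.
        while True:
            number += 1
            name = f"{prefix}{number:03d}.png"
            if name not in used:
                break
        os.rename(os.path.join(folder, slide_pattern.format(n=slide)),
                  os.path.join(folder, name))
        row["png"] = name
        used.add(name)


with tempfile.TemporaryDirectory() as folder:
    # Two empty files stand in for the slides exported by PowerPoint.
    for n in (1, 2):
        open(os.path.join(folder, f"SLIDE{n}.PNG"), "wb").close()
    worksheet = [
        {"phrase": "first phrase", "png": "A001.png"},  # already processed
        {"phrase": "second phrase", "png": ""},
        {"phrase": "third phrase", "png": ""},
    ]
    assign_png_names(worksheet, folder)
    renamed = sorted(os.listdir(folder))
```

The pairing relies on the worksheet order being unchanged since the slides were created, which is exactly the risk noted above; a sequence number column would remove that dependency.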

Macro 5 creates a new copy of the latest HTML file, using the contents of the worksheet to determine which phrase to replace with which PNG file.
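A minimal Python sketch of the replacement step follows, assuming that the block to replace is the <p> element that directly wraps a <span dir="RTL"> and that the worksheet has been reduced to a phrase-to-file-name mapping. The `replace_phrases` function and the regular expression are illustrative assumptions and may need adjusting for HTML that differs from the test document's output.

```python
import html
import re
from typing import Dict

# Assumption: the Arabic block is a <p> whose first child is <span dir="RTL">.
RTL_BLOCK = re.compile(
    r'<p\b[^>]*>\s*<span\s+dir="?RTL"?[^>]*>(.*?)</span>.*?</p>',
    re.IGNORECASE | re.DOTALL,
)


def replace_phrases(html_text: str, png_for_phrase: Dict[str, str]) -> str:
    """Swap each Arabic block for an IMG element when a PNG name is known."""
    def substitute(match: "re.Match[str]") -> str:
        # Normalise the phrase the same way it was stored in the worksheet.
        phrase = " ".join(html.unescape(match.group(1)).split())
        png = png_for_phrase.get(phrase)
        if not png:
            return match.group(0)  # no image yet: leave the block untouched
        return f'<br><img src="{png}"><br>'
    return RTL_BLOCK.sub(substitute, html_text)


page = ('<p class="MsoNormal">Before</p>\n'
        '<p class="MsoNormal" align="center"><span dir="RTL"\n'
        'style="font-size:24.0pt">\n&#64336;&#64337;\n</span>'
        '<span style="font-size:24.0pt"></span></p>\n'
        '<p class="MsoNormal">After</p>')

updated = replace_phrases(page, {"\ufb50\ufb51": "A001.png"})
```

Blocks whose phrase has no PNG name yet are left as they are, so the step can be rerun safely after more images have been prepared.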

This process is not ideal, but it will achieve the desired result and has no obvious complications. Any suggestions for improving it?

