Separating a multipart email using regular expressions

Problem description

Before you guys go telling me that Regex is the epitome of all evil... I already know.
If I had more hair, it would be ripped out already.

So, on to the question. I have made a parser using regex that strips out the desired parts of an HTML email. Why on earth would I want to do that? Because I'm still a beginner programmer, OK? If you can suggest a better way then by all means... do. The parser works perfectly on the normal HTML parts of an email; however, if someone sends me an email with just one attachment (or more)... ALL HELL BREAKS LOOSE!

Instead of getting what a normal HTML email looks like, I get the plain-text version with the HTML version concatenated onto the end, like so:

--_1b4078c9-04f5-4cca-a220-e5b30eddef46_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

To: ****@****=3B ****@**** | Emmanuel Smith=3B= Jonny Barnes
cc: |
bcc: |
Ref: Test123
---
Lorem ipsum dolor sit amet=2C consectetur adipiscing elit. Praesent in augu=e nec justo tempor varius eu et tellus. Nunc id massa tortor=2C ut lobortis= sem. Class aptent taciti sociosqu ad litora torquent per conubia nostra=2C= per inceptos himenaeos. Maecenas quis nisl nec quam tristique posuere sed =at nibh. Cras fringilla vestibulum metus vel porttitor. 2 + 2 =3D 7 Cras ia=culis=2C erat nec gravida accumsan=2C metus felis vestibulum risus=2C quis =venenatis nisl nulla sed diam. Aenean quis viverra velit. Etiam quis massa =lectus=2C faucibus facilisis sem. Curabitur non eros tellus. Sed at ligula =neque. Donec elementum rhoncus volutpat. Curabitur eu accumsan erat. Phasel=lus auctor odio dolor=2C ut ornare augue. Suspendisse vel est nibh. Vivamus= facilisis placerat augue sit amet aliquam. Maecenas viverra=2C ipsum a tin=cidunt elementum=2C arcu tellus rutrum ipsum=2C et dignissim urna orci ac m=i. Vivamus non odio massa. Nulla congue massa eu leo pretium non consequat =urna molestie.
Integer neque odio=2C scelerisque at molestie quis=2C congue sed arcu. Prae=sent a arcu odio.
Donec sollicitudin=2C quam vel tincidunt lobortis=2C urna= augue cursus lorem=2C in eleifend nunc risus nec neque. Donec euismod maur=is non nibh blandit sollicitudin. Vivamus sed tincidunt augue. Suspendisse =iaculis massa ut tellus rutrum auctor. Cras venenatis consequat urna in viv=erra. Ut blandit imperdiet dolor non scelerisque. Suspendisse potenti. Sed =vitae lacus ac odio euismod tempus. Aenean ut sem odio. Curabitur auctor pu=rus a diam iaculis facilisis. Integer molestie commodo mauris a imperdiet. =Nunc aliquet tempus orci sit amet viverra. =20Hotmail is redefining busy with tools for the New Busy. Get more from your =inbox. See how. =20_________________________________________________________________The New Busy is not the old busy. Search=2C chat and e-mail from your inbox=..http://www.windowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:O=N:WL:en-US:WM_HMP:042010_3=
--_1b4078c9-04f5-4cca-a220-e5b30eddef46_
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<html><head><style><!--..hmmessage P{margin:0px=3Bpadding:0px}body.hmmessage{font-size: 10pt=3Bfont-family:Verdana}--></style></head><body class=3D'hmmessage'>To: ****@**** ****@**** | Emmanuel Smith=3B= Jonny Barnes<br><div>cc: |</div><div>bcc: |</div><div>Ref: Test123</div><d=iv><br><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec=xecxApple-style-span" style=3D"font-family:Tahoma=2C Verdana=2C Arial=2C sa=ns-serif=3Bcolor:rgb(68=2C 68=2C 68)"><font class=3D"ecxecxecxecxecxecxecxe=cxecxecxecxecxecxecxecxApple-style-span" color=3D"#000000"><font class=3D"e=cxecxecxecxecxecxApple-style-span" face=3D"Verdana">---<br></font></font><d=iv><font class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br>=</font></div><div><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxecx=ecxecxecxecxecxecxecxecxecxecxecxecxecxecxApple-style-span" style=3D"font-s=ize:11px=3Bline-height:14px"><font class=3D"ecxecxecxecxecxecxApple-style-s=pan"
face=3D"Verdana">Lorem ipsum dolor sit amet=2C consectetur adipiscing =elit. Praesent in augue nec justo tempor varius eu et tellus. Nunc id massa= tortor=2C ut lobortis sem. Class aptent taciti sociosqu ad litora torquent= per conubia nostra=2C per inceptos himenaeos. Maecenas quis nisl nec quam =tristique posuere sed at nibh. Cras fringilla vestibulum metus vel porttito=r. 2 + 2 =3D 7 Cras iaculis=2C erat nec gravida accumsan=2C metus felis ves=tibulum risus=2C quis venenatis nisl nulla sed diam. Aenean quis viverra ve=lit. Etiam quis massa lectus=2C faucibus facilisis sem. Curabitur non eros =tellus. Sed at ligula neque. Donec elementum rhoncus volutpat. Curabitur eu= accumsan erat. Phasellus auctor odio dolor=2C ut ornare augue. Suspendisse= vel est nibh. Vivamus facilisis placerat augue sit amet aliquam. Maecenas =viverra=2C ipsum a tincidunt elementum=2C arcu tellus rutrum ipsum=2C et di=gnissim urna orci ac mi. Vivamus non odio massa. Nulla congue massa eu leo =pretium non consequat urna molestie.</font></span></div><div><span class=3D="ecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec=xecxecxecxApple-style-span" style=3D"font-size:11px=3Bline-height:14px"><fo=nt class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br></font=></span></div><div><span class=3D"ecxecxecxecxecxecxecxecxecxecxecxecxecxec=xecxecxecxecxecxecxecxecxecxecxecxecxecxecxApple-style-span" style=3D"font-=size:11px=3Bline-height:14px"><font class=3D"ecxecxecxecxecxecxApple-style-=span" face=3D"Verdana"><br></font></span></div><div><span class=3D"ecxecxec=xecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxec=xApple-style-span" style=3D"font-size:11px=3Bline-height:14px"><font class==3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana"><br></font></span>=</div><div><font class=3D"Apple-style-span" face=3D"Verdana" size=3D"3"><sp=an class=3D"Apple-style-span" style=3D"font-size: 11px=3B line-height: 
14px==3B"><br></span></font></div><span class=3D"ecxecxecxecxecxecxecxecxecxecxe=cxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxecxe=cxecxecxecxecxApple-style-span" style=3D"font-family:Arial=2C Helvetica=2C =sans=3Bfont-size:11px"><p style=3D"margin-right:0px=3Bmargin-bottom:14px=3B=margin-left:0px=3Btext-align:justify=3Bfont-size:11px=3Bline-height:14px=3B=padding-top:0px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadding-left:0px="><font class=3D"ecxecxecxecxecxecxApple-style-span" face=3D"Verdana">Integ=er neque odio=2C scelerisque at molestie quis=2C congue sed arcu. Praesent =a arcu odio. Donec sollicitudin=2C quam vel tincidunt lobortis=2C urna augu=e cursus lorem=2C in eleifend nunc risus nec neque. Donec euismod mauris no=n nibh blandit sollicitudin. Vivamus sed tincidunt augue. Suspendisse iacul=is massa ut tellus rutrum auctor. Cras venenatis consequat urna in viverra.= Ut blandit imperdiet dolor non scelerisque. Suspendisse potenti. Sed vitae= lacus ac odio euismod tempus. Aenean ut sem odio. Curabitur auctor purus a= diam iaculis facilisis. Integer molestie commodo mauris a imperdiet. Nunc =aliquet tempus orci sit amet viverra.</font></p><p style=3D"margin-right:0p=x=3Bmargin-bottom:14px=3Bmargin-left:0px=3Btext-align:justify=3Bfont-size:1=1px=3Bline-height:14px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadding-bott=om:0px=3Bpadding-left:0px"><font class=3D"ecxecxecxecxecxecxApple-style-spa=n" face=3D"Verdana"><br></font></p><p style=3D"margin-right:0px=3Bmargin-bo=ttom:14px=3Bmargin-left:0px=3Btext-align:justify=3Bfont-size:11px=3Bline-he=ight:14px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadd=ing-left:0px"><font class=3D"Apple-style-span" face=3D"Verdana"><br></font>=</p></span></span></div> <br><hr>Hotmail is redefining busy with= tools for the New Busy. Get more from your inbox. 
<a href=3D"http://www.wi=ndowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:ON:WL:en-US:WM=_HMP:042010_2">See how.</a> <br /><hr />The New Busy is not the =old busy. Search=2C chat and e-mail from your inbox. <a href=3D'http://www.=windowslive.com/campaign/thenewbusy?ocid=3DPID28326::T:WLMTAGL:ON:WL:en-US:=WM_HMP:042010_3' target=3D'_new'>Get started.</a></body></html>=
--_1b4078c9-04f5-4cca-a220-e5b30eddef46_--

So my question is... How can I separate the HTML version from the text version using regex (or by easier means)?

Solution

There are a few open source C# MIME parsers available:

http://mimeparser.codeplex.com/
http://anmar.eu.org/projects/sharpmimetools/
http://www.codeproject.com/KB/cs/MIME_De_Encode_in_C_.aspx

The last two are a bit old. If they don't easily compile, their source might point you in the right direction.

Remember, an email can contain an attachment that is an email that contains an attachment, and so on. At some point, regex will let you down.
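To make the "use a MIME parser" advice concrete, here is a minimal sketch of the idea. It is not the API of any of the C# libraries linked above; it uses Python's standard-library email package purely as an illustration. The sample message, its boundary string, and the helper name split_bodies are made up for this example. The parser splits the message on its boundary, walks nested parts, skips attachments, and undoes the quoted-printable encoding (the =3D and =2C sequences seen in the dump).

```python
from email import message_from_string
from email.message import Message

# Hypothetical sample: a tiny multipart/alternative message with the same
# shape as the one in the question (boundary and body text are made up).
RAW = (
    "MIME-Version: 1.0\r\n"
    'Content-Type: multipart/alternative; boundary="_demo-boundary_"\r\n'
    "\r\n"
    "--_demo-boundary_\r\n"
    'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    "Content-Transfer-Encoding: quoted-printable\r\n"
    "\r\n"
    "2 + 2 =3D 7=2C apparently\r\n"
    "--_demo-boundary_\r\n"
    'Content-Type: text/html; charset="iso-8859-1"\r\n'
    "Content-Transfer-Encoding: quoted-printable\r\n"
    "\r\n"
    "<p class=3D'x'>2 + 2 =3D 7</p>\r\n"
    "--_demo-boundary_--\r\n"
)


def split_bodies(msg: Message) -> dict[str, str]:
    """Return the first text/plain and text/html bodies, decoded to str."""
    bodies: dict[str, str] = {}
    # walk() visits nested parts depth-first, so a message attached to a
    # message attached to a message is reached without extra code.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body text of their own
        if part.get_content_disposition() == "attachment":
            continue  # skip attached files; only the message text is wanted
        ctype = part.get_content_type()
        if ctype in ("text/plain", "text/html") and ctype not in bodies:
            payload = part.get_payload(decode=True)  # undoes quoted-printable
            charset = part.get_content_charset() or "utf-8"
            bodies[ctype] = payload.decode(charset, errors="replace")
    return bodies


bodies = split_bodies(message_from_string(RAW))
print(bodies["text/plain"])
print(bodies["text/html"])
```

The point of the design is that the boundary string, the transfer encoding, and the nesting depth are all read from the message headers rather than guessed by a pattern, which is where a hand-written regex tends to break. A C# MIME library would be expected to offer an equivalent part-by-part traversal, though the exact calls will differ.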