
I'd count the number of occurrences of '\r\n', of '\n' without a preceding '\r', and of '\r' without a following '\n', and let the majority decide.
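A minimal sketch of that counting idea, assuming Python 3 and plain `str.count` rather than regexes. The names `count_line_endings` and `majority_ending` are made up for illustration, and the tie-breaking here (whichever ending comes first in the dict) is an arbitrary assumption, not something proposed in the thread:

```python
def count_line_endings(text):
    # Count each style of line ending in text.
    crlf = text.count('\r\n')
    # Every '\r\n' pair contributes one '\r' and one '\n' to the raw
    # counts, so subtracting the pairs leaves only the lone characters.
    lf = text.count('\n') - crlf   # '\n' without a preceding '\r'
    cr = text.count('\r') - crlf   # '\r' without a following '\n'
    return {'\r\n': crlf, '\n': lf, '\r': cr}


def majority_ending(text):
    # Return the most frequent line ending, or None if there are none.
    counts = count_line_endings(text)
    ending, count = max(counts.items(), key=lambda item: item[1])
    return ending if count else None
```

For a text with one '\r\n', two bare '\n' and one bare '\r', `majority_ending` picks '\n'; for a text with no line endings at all it returns `None` and leaves the choice of default to the caller.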

Sybren
--
The problem with the world is stupidity. Not saying there should be a capital punishment for stupidity, but why don't we just take the safety labels off of everything and let the problem solve itself?
Frank Zappa

Sounds reasonable, edge cases for small files be damned. :-)

This is what I came up with. As you can see from the docstring, it attempts to do sensible(-ish) things in the event of a tie, or no line endings at all. Comments/corrections welcomed. I know the tests aren't very useful (because they make no *assertions* they won't tell you if it breaks), but you can see what's going on:

    import re
    import os

    rn = re.compile('\r\n')
    r = re.compile('\r(?!\n)')
    n = re.compile('(?<!\r)\n')

    # Sequence of (regex, literal, priority) for each line ending
    line_ending = [(n, '\n', 3), (rn, '\r\n', 2), (r, '\r', 1)]


    def find_ending(text, default=os.linesep):
        """
        Given a piece of text, use a simple heuristic to determine the line
        ending in use.

        Returns the value assigned to default if no line endings are found.
        This defaults to ``os.linesep``, the native line ending for the
        machine.

        If there is a tie between two endings, the priority chain is
        ``'\n', '\r\n', '\r'``.
        """
        results = [(len(exp.findall(text)), priority, literal) for
                   exp, literal, priority in line_ending]
        results.sort()
        print results
        if not sum([m[0] for m in results]):
            return default
        else:
            return results[-1][-1]


    if __name__ == '__main__':
        tests = [
            'hello\ngoodbye\nmy fish\n',
            'hello\r\ngoodbye\r\nmy fish\r\n',
            'hello\rgoodbye\rmy fish\r',
            'hello\rgoodbye\n',
            '',
            '\r\r\r \n\n',
            '\n\n \r\n\r\n',
            '\n\n\r \r\r\n',
            '\n\r \n\r \n\r',
        ]
        for entry in tests:
            print repr(entry)
            print repr(find_ending(entry))
            print
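Since the tests above make no assertions, here is a hedged sketch of how the same heuristic could be checked automatically. It assumes Python 3, renames the table to `LINE_ENDINGS`, and drops the debugging print. The expected values were worked out by hand from the priority chain in the docstring, so treat them as one reading of the intended behaviour rather than the author's own test suite:

```python
import os
import re

# (regex, literal, priority) for each line ending, mirroring the post above.
LINE_ENDINGS = [
    (re.compile(r'(?<!\r)\n'), '\n', 3),   # '\n' not preceded by '\r'
    (re.compile(r'\r\n'), '\r\n', 2),
    (re.compile(r'\r(?!\n)'), '\r', 1),    # '\r' not followed by '\n'
]


def find_ending(text, default=os.linesep):
    # Sort by (count, priority); the last tuple has the highest count,
    # with the higher priority winning when counts are tied.
    results = sorted(
        (len(exp.findall(text)), priority, literal)
        for exp, literal, priority in LINE_ENDINGS
    )
    if not any(count for count, _, _ in results):
        return default
    return results[-1][-1]


if __name__ == '__main__':
    cases = [
        ('hello\ngoodbye\nmy fish\n', '\n'),
        ('hello\r\ngoodbye\r\nmy fish\r\n', '\r\n'),
        ('hello\rgoodbye\rmy fish\r', '\r'),
        ('hello\rgoodbye\n', '\n'),        # tie between '\r' and '\n'
        ('', os.linesep),                  # no endings: fall back to default
        ('\r\r\r \n\n', '\r'),
        ('\n\n \r\n\r\n', '\n'),           # tie between '\n' and '\r\n'
    ]
    for text, expected in cases:
        assert find_ending(text) == expected, repr(text)
    print('all checks passed')
```

Sorting on the `(count, priority, literal)` tuple is what lets a single `sorted` call handle both the majority vote and the tie-break.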

Fuzzyman http://www.voidspace.org.uk/python/index.shtml
08-16 07:59