Reading a mainframe EBCDIC file

Question

I have an EBCDIC-encoded mainframe file that I need to convert to ASCII. Which libraries or tools can I use to do that? I am most familiar with Python.

The file I received came with a cookbook (a record layout description) that can be used to parse it; part of it is shown below.

What do the types 'C', 'P' and 'B' mean? I'm guessing C = character, B = byte, P = packed number?

 :------------------------------------------------------------------------------:
 :LAYOUT NAME:         B224E           DATE:    02/20/14         PAGE   7 OF  14:
 :                     -------                  --------              ---    ---:
 :COBOL:  PAN-NAME: NONE                 COPYLIB-NAME: RECB224E                 :
 :                  --------------------               --------------------     :
 :BAL  :  PAN-NAME: NONE                 COPYLIB-NAME: NONE                     :
 :------------------------------------------------------------------------------:
 :TYPE OF RECORD:  EXTENDED SORT KEY AREA - SEGMENT "A"  (OPTIONAL)             :
 :------------------------------------------------------------------------------:
 :POSITION  : LENGTH : TYPE :   DESCRIPTION                                     :
 :----------:--------:------:---------------------------------------------------:
 :          :        :      :                                                   :
 :          :        :      :                                                   :
 :          :        :      :                                                   :
 :001 - 001 :    1   :   C  :  SEGMENT IDENTIFIER - "A"                         :
 :          :        :      :                                                   :
 :002 - 003 :    2   :   P  :  SEGMENT LENGTH                                   :
 :          :        :      :                                                   :
 :004 - ??? :   ???  :   C  :  EXTENDED SORT KEY AREA                           :
 :          :        :      :                                                   :

Recommended answer

Take a look at the codecs module. According to the standard encodings table, Python ships an EBCDIC codec under the name cp500 (other EBCDIC code pages, such as cp037, are listed there too, so check which one your file uses). Something like the following should work:

import codecs

# Read the raw bytes, then decode them from EBCDIC (code page 500) to text.
with open("EBCDIC.txt", "rb") as ebcdic:
    ascii_txt = codecs.decode(ebcdic.read(), "cp500")
    print(ascii_txt)
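
The decoding step can also be checked without a real mainframe file by building a few EBCDIC bytes in memory. This is only a minimal sketch: the sample string is an illustrative assumption, not data from the question.

```python
import codecs

# Build a small EBCDIC byte string in memory (illustrative sample, not from the question).
sample_text = "HELLO 123"
ebcdic_bytes = sample_text.encode("cp500")

# The raw bytes differ from their ASCII counterparts...
print(ebcdic_bytes.hex())

# ...but decoding with the same code page recovers the original text.
decoded = codecs.decode(ebcdic_bytes, "cp500")
print(decoded)
```

If the decoded text of a real file looks mostly right but some punctuation is wrong, that is usually a hint that the file uses a different EBCDIC code page.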

As mpez0 noted in the comments, if you're using Python 3, you can condense the code to this:

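# The encoding must be passed by keyword; the third positional argument of open() is buffering.
with open("EBCDIC.txt", "rt", encoding="cp500") as ebcdic:
    print(ebcdic.read())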
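
To end up with an ASCII file rather than printed text, the decoded text can be written back out with an ASCII encoding. The following is a sketch under assumptions: it treats the whole file as character data, and the helper name `convert_to_ascii`, the chunk size, and the choice of `errors="replace"` (which turns characters with no ASCII equivalent into `?`) are illustrative, not part of the original answer.

```python
def convert_to_ascii(src_path, dst_path, source_encoding="cp500"):
    """Decode an EBCDIC text file and rewrite it as ASCII (hypothetical helper)."""
    # newline="" disables newline translation on both sides.
    with open(src_path, "rt", encoding=source_encoding, newline="") as src, \
         open(dst_path, "wt", encoding="ascii", errors="replace", newline="") as dst:
        # Copy in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: src.read(64 * 1024), ""):
            dst.write(chunk)
```

A call such as `convert_to_ascii("EBCDIC.txt", "ASCII.txt")` would then produce the converted copy; the output file name here is made up.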

I don't have an EBCDIC file handy, so I can't test this, but it should be enough to get you started.
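
One caveat: the layout in the question mixes character fields (type C) with a type P field, so decoding a whole record as text would likely mangle the non-character bytes. The sketch below slices the record by position instead. It assumes, as the asker guesses and nothing in the source confirms, that P means IBM packed decimal (two digits per byte, sign in the last nibble). The names `unpack_packed_decimal` and `parse_segment_header` and the sample bytes are hypothetical.

```python
def unpack_packed_decimal(raw: bytes) -> int:
    """Decode packed decimal: two BCD digits per byte, sign in the final nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)      # last byte holds one digit plus the sign nibble
    sign_nibble = raw[-1] & 0x0F
    if any(d > 9 for d in digits):
        raise ValueError("not a packed decimal field")
    value = int("".join(map(str, digits)))
    # 0xD (and 0xB) mark negative values; other sign nibbles are treated as positive.
    return -value if sign_nibble in (0x0B, 0x0D) else value


def parse_segment_header(record: bytes, encoding: str = "cp500"):
    """Extract positions 001 and 002-003 from the layout (hypothetical helper)."""
    segment_id = record[0:1].decode(encoding)             # type C, position 001
    segment_length = unpack_packed_decimal(record[1:3])   # type P, positions 002-003
    return segment_id, segment_length


# Made-up sample: "A" in EBCDIC (0xC1), packed +25 (0x02 0x5C), then some key text.
sample = b"\xc1\x02\x5c" + "SORTKEY".encode("cp500")
print(parse_segment_header(sample))
```

The extended sort key area (position 004 onward) is left alone here because the layout gives its length only as "???"; how SEGMENT LENGTH bounds it would need to be confirmed with whoever supplied the file.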

