本文介绍了在计算字母时,Haskell可以击败C吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述 剧透:是的。见下面。 试图优化一个字母计数器来匹配C.我已经把它打到了2倍的赤字。 letterCount :: B.ByteString - > V.Vector Int letterCount bs = V.accumulate (\ a _ - > a + 1)(V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len(\i - >(fromIntegral $!B.index bs i,())) 一些注意事项: 在我将 Data.Vector 更改为 Data.Vector.Unboxed 之前,它非常慢。为什么会这样? 我认为大部分时间都会花在 accumulate 上。我错了。 70%的时间用于 generate 。 Haskell代码遭受不必要地将Word8转换为Int;此外,()的无用部队可能实际上或可能不会实际创建。 导入限定的Data.ByteString为B 导入限定的Data.Vector.Unboxed为V import System.Environment import Text.Printf letterCount :: B.ByteString - > V.Vector Int letterCount bs = V.accumulate (\ a _ - > a + 1)(V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len(\i - >(fromIntegral $!B.index bs i,())) printCounts :: V.Vector Int - > IO() printCounts cs = mapM_ (uncurry $ printf%c:%d\\\)(zip(映射到Enum [0..255]: :String)(V.toList cs)) main :: IO() main = do filename< - fmap head getArgs f< - B .readFile文件名 let counts = letterCount f printCounts counts 竞争C代码: #include< assert.h> #include< stdio.h> #include< string.h> #include< sys / stat.h> #include< stdlib.h> int letcnt [256]; int * letter_count(unsigned char * s,unsigned int len) { int i; memset(letcnt,0,256 * sizeof(int)); for(i = 0; i< len; i ++){ letcnt [*(s + i)] ++; } return(letcnt); } void print_counts(){ int i; for(i = 0; i printf('%c':%d \ n,(unsigned char)i,letcnt [i]); $ b $ // st_size int main(int argc,char ** argv) { assert(argc == 2); FILE * f = fopen(argv [1],r); struct stat st; stat(argv [1],& st); off_t len = st.st_size; unsigned char * contents = calloc(len,1); fread(内容,len,1,f); fclose(f); letter_count(contents,len); print_counts(); 返回0; } 定时; $ time ./a.out / usr / share / dict / words> / dev / null real 0m0.012s 用户0m0.005s sys 0m0.005s $ time ./lettercount / usr / share /字典/单词> / dev / null real 0m0.017s user 0m0.009s sys 0m0.007s 更新 我认为性能上限取决于此错误: runST不是免费的。并不是说我相信不可能进一步优化,但只要runST带来一些开销,就不太可能接近C. 另外,根据@ Zeta的评论修正了C代码。 / p> 解决方案是的。如果使用 -fllvm 编译,则Haskell将匹配 User 时间中的C。令人惊讶的是,如果你切换到Lazy Bytestrings,那么Haskell版本将在 Real 时间的C版本上以小幅度但显着的优势胜过。 将限定的Data.ByteString.Lazy.Char8导入为B 将限定的Data.Vector.Unboxed导入为V import System.Environment import Text.Printf letterCount :: B.ByteString - > V.Vector Int letterCount bs = V.unsafeAccumulate (\ a_ - > a + 1)(V.replicate 256 0)(解析bs) 解析:: B.ByteString - > V.Vector(Int,()) parse = V.unfoldr step where step s = if B.null s then Nothing else Just(( fromIntegral。fromEnum $ B.head s,()),B.tail s) { - #INLINE parse# - } printCounts :: V.Vector Int - > IO() printCounts cs = mapM_ (uncurry $ printf%c:%d\\\)(zip(映射到Enum [0..255]: :String)(V.toList cs)) main :: IO() main = do filename< - fmap head getArgs f< - B .readFile文件名 let counts = letterCount f printCounts counts 请记住编译例如: ghc -O2 -fllvm letterCount.hs C.我喜欢它! 更新 公平地更新了C代码以使用单个缓冲区,避免了预先大量分配(或任何分配),并且对缓存更友好。现在,Haskell和C代码没有显着差异,对于大型的150M输入文件,运行时间大约为190毫秒: #include< assert.h> #include< stdio.h> #include< string.h> #include< sys / stat.h> #include< stdlib.h> #define CHUNK 16384 int letcnt [256]; int * letter_count(unsigned char * s,unsigned int len) { int i; for(i = 0; i< len; i ++){ letcnt [*(s + i)] ++; } return(letcnt); } int * letter_count_chunks(unsigned int len,FILE * f) { int i; unsigned char chunk [CHUNK]; memset(letcnt,0,sizeof(letcnt)); for(i = 0; i< len - CHUNK; i + = CHUNK){ fread(chunk,CHUNK,1,f); letter_count(chunk,CHUNK); } fread(chunk,len - i,1,f); letter_count(chunk,len - i); 返回letcnt; } void print_counts(){ int i; for(i = 0; i printf('%c':%d \ n,(unsigned char)i,letcnt [i]); $ b $ // st_size int main(int argc,char ** argv) { assert(argc == 2); FILE * f = fopen(argv [1],r); struct stat st; stat(argv [1],& st); off_t len = st.st_size; letter_count_chunks(len,f); fclose(f); print_counts(); 返回0; } Spoiler: Yes. See below.Trying to optimize a letter counter to match C. I've fought it to a 2x deficit.letterCount :: B.ByteString -> V.Vector IntletterCount bs = V.accumulate (\a _ -> a + 1) (V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len (\i -> (fromIntegral $! B.index bs i, ()))Some notes:It was really slow until I changed Data.Vector to Data.Vector.Unboxed. Why is that?I thought most of the time would be spent in accumulate. I was wrong.70% of the time is spent in generate.Haskell code suffers from having to pointlessly convert Word8 to Int; Also, a useless army of () may or may not actually be created.Full listing:import qualified Data.ByteString as Bimport qualified Data.Vector.Unboxed as Vimport System.Environmentimport Text.PrintfletterCount :: B.ByteString -> V.Vector IntletterCount bs = V.accumulate (\a _ -> a + 1) (V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len (\i -> (fromIntegral $! B.index bs i, ()))printCounts :: V.Vector Int -> IO ()printCounts cs = mapM_ (uncurry $ printf "%c: %d\n") (zip (map toEnum [0..255] :: String) (V.toList cs))main :: IO ()main = do filename <- fmap head getArgs f <- B.readFile filename let counts = letterCount f printCounts countsCompeting C code: #include <assert.h> #include <stdio.h> #include <string.h> #include <sys/stat.h> #include <stdlib.h> int letcnt [256]; int* letter_count(unsigned char *s, unsigned int len) { int i; memset(letcnt, 0, 256 * sizeof(int)); for(i = 0; i < len; i++){ letcnt[*(s + i)]++; } return (letcnt); } void print_counts() { int i; for(i = 0; i < 256; i++) { printf("'%c': %d\n", (unsigned char) i, letcnt[i]); } } // st_size int main(int argc, char **argv) { assert(argc == 2); FILE* f = fopen(argv[1], "r"); struct stat st; stat(argv[1], &st); off_t len = st.st_size; unsigned char* contents = calloc(len, 1); fread(contents, len, 1, f); fclose(f); letter_count(contents, len); print_counts(); return 0; }Timings;$ time ./a.out /usr/share/dict/words > /dev/nullreal 0m0.012suser 0m0.005ssys 0m0.005s$ time ./lettercount /usr/share/dict/words > /dev/nullreal 0m0.017suser 0m0.009ssys 0m0.007sUpdateI think the performance ceiling is down to this bug: runST isn't free. Not that I believe it's impossible to optimize further but unlikely to approach C so long as runST imposes some overhead.Also, fixed C-code based on @Zeta's comment. 解决方案 Yes. If you compile with -fllvm then Haskell will match C in User time. The big surprise is if you switch to Lazy Bytestrings, the Haskell version will beat the C version on Real time by a small but significant margin.import qualified Data.ByteString.Lazy.Char8 as Bimport qualified Data.Vector.Unboxed as Vimport System.Environmentimport Text.PrintfletterCount :: B.ByteString -> V.Vector IntletterCount bs = V.unsafeAccumulate (\a _ -> a + 1) (V.replicate 256 0) (parse bs)parse :: B.ByteString -> V.Vector (Int, ())parse = V.unfoldr step where step s = if B.null s then Nothing else Just ((fromIntegral . fromEnum $ B.head s, ()), B.tail s){-# INLINE parse #-}printCounts :: V.Vector Int -> IO ()printCounts cs = mapM_ (uncurry $ printf "%c: %d\n") (zip (map toEnum [0..255] :: String) (V.toList cs))main :: IO ()main = do filename <- fmap head getArgs f <- B.readFile filename let counts = letterCount f printCounts countsRemember to compile like:ghc -O2 -fllvm letterCount.hsSo, Vector + ByteString.Lazy + LLVM > C. I love it!UpdateIn fairness to C I updated the C code to use a single buffer which avoids doing a big allocation up front (or any allocations at all) and will be more cache friendly. Now the Haskell and C codes show no significant difference, both about 190ms in run-time on a best-of-3 basis against a large, 150M input file:#include <assert.h>#include <stdio.h>#include <string.h>#include <sys/stat.h>#include <stdlib.h>#define CHUNK 16384int letcnt [256];int* letter_count(unsigned char *s, unsigned int len){ int i; for(i = 0; i < len; i++){ letcnt[*(s + i)]++; } return (letcnt);}int* letter_count_chunks(unsigned int len, FILE* f){ int i; unsigned char chunk [CHUNK]; memset(letcnt, 0, sizeof(letcnt)); for(i = 0; i < len - CHUNK; i+= CHUNK) { fread(chunk, CHUNK, 1, f); letter_count(chunk, CHUNK); } fread(chunk, len - i, 1, f); letter_count(chunk, len - i); return letcnt;}void print_counts() { int i; for(i = 0; i < 256; i++) { printf("'%c': %d\n", (unsigned char) i, letcnt[i]); }}// st_sizeint main(int argc, char **argv){ assert(argc == 2); FILE* f = fopen(argv[1], "r"); struct stat st; stat(argv[1], &st); off_t len = st.st_size; letter_count_chunks(len, f); fclose(f); print_counts(); return 0;} 这篇关于在计算字母时,Haskell可以击败C吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!
10-29 21:15