Problem description
In this code I am trying to read specific lines from one text file and write them to another. It works well for English characters, but for Tamil the character count is wrong. For example, தமிழ் is counted as 5, i.e. "த", "ம", "ி", "ழ" and "்", but I want it counted as 3, i.e. "த", "மி" and "ழ்".
What I have tried:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class ii {
    public static void main(String[] args) {
        // Words of this length are copied to the output file.
        final int targetLength = 2;
        // try-with-resources closes both files even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(new FileReader("F:\\New folder (2)\\N.txt"));
             BufferedWriter bw = new BufferedWriter(new FileWriter("F:\\New folder (2)\\o.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String word : line.split(" ")) {
                    // String.length() counts UTF-16 code units, so "தமிழ்" reports 5 here, not 3.
                    if (word.length() == targetLength) {
                        bw.write(word);
                        bw.newLine();
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Recommended answer
(e.g.) தமிழ்
It counts as 5, i.e. "த", "ம", "ி", "ழ" and "்".
But I want to count it as 3, i.e. "த", "மி" and "ழ்".
Your problem comes from the fact that some letters are composed of more than one char.
Some letters carry a suffix: for example, "மி" is composed of a leading char "ம" and a suffix char "ி".
I don't know your alphabet, but rather than reading chars one by one as with the Latin alphabet, you have to detect whether a char has a suffix.
Either you need a list of every char + suffix combination, or you have to check whether the current char is followed by a suffix char.
In any case, you need to change your code to handle this situation.
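The second option, checking whether a char is followed by a suffix char, might be sketched as below. This is only an illustration and not code from the original thread: the class and method names are hypothetical, and it assumes that a "suffix" can be recognised as a Unicode combining mark (general categories Mn, Mc and Me), which covers the Tamil vowel signs and the pulli "்".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups each base char with the suffix chars that follow it.
class TamilLetterSplitter {

    // Assumption: a suffix is any Unicode combining mark (categories Mn, Mc, Me).
    static boolean isSuffix(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || type == Character.ENCLOSING_MARK;
    }

    // Splits a word into letters, where a letter is a base char plus its suffixes.
    static List<String> splitLetters(String word) {
        List<String> letters = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < word.length(); ) {
            int cp = word.codePointAt(i);
            // A non-suffix char starts a new letter, so flush the previous one.
            if (!isSuffix(cp) && current.length() > 0) {
                letters.add(current.toString());
                current.setLength(0);
            }
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            letters.add(current.toString());
        }
        return letters;
    }

    public static void main(String[] args) {
        // "த", "மி", "ழ்"
        System.out.println(splitLetters("தமிழ்").size()); // prints 3
    }
}
```

With such a helper, the length check in the question would compare `splitLetters(word).size()` instead of `word.length()`. The source file has to be compiled as UTF-8 for the Tamil literal to survive.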
[Update]
Yes, I have the possible letters in a string:
String s = "ஃஅஆஇஈஉஊஎஏஐஒஓஔக்ககாகிகீகுகூகெகேகைகொகோகௌங்ஙஙாஙிஙீஙுஙூஙெஙேஙைஙொஙோஙௌச்சசாசிசீசுசூசெசேசைசொசோசௌஞ்ஞஞாஞிஞீஞுஞூஞெஞேஞைஞொஞோஞௌட்டடாடிடீடுடூடெடே";
Then you have to check whether the current char together with the next one is in the string.
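One way to sketch that lookup is shown below. It is a rough illustration under stated assumptions, not the asker's code: the class name, the `LETTERS` table and `MAX_LEN` are hypothetical, and the table is deliberately tiny. It stores each letter as a separate entry in a `Set` rather than in one concatenated string, because a plain `contains` on a concatenation can also match a pair that merely straddles two neighbouring letters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical lookup-based splitter: the longest known letter wins at each position.
class TamilLookupSplitter {

    // Assumption: an incomplete demo table; a real one would list every letter.
    static final Set<String> LETTERS =
        new HashSet<>(Arrays.asList("த", "ம", "மி", "ழ", "ழ்"));

    // Longest entry in the table, measured in chars (base char + one suffix here).
    static final int MAX_LEN = 2;

    static List<String> split(String word) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < word.length()) {
            int taken = 1; // fall back to a single char when no longer entry matches
            for (int len = Math.min(MAX_LEN, word.length() - i); len > 1; len--) {
                if (LETTERS.contains(word.substring(i, i + len))) {
                    taken = len;
                    break;
                }
            }
            out.add(word.substring(i, i + taken));
            i += taken;
        }
        return out;
    }

    public static void main(String[] args) {
        // "த", "மி", "ழ்"
        System.out.println(split("தமிழ்").size()); // prints 3
    }
}
```

Indexing by `char` is enough here because Tamil lies in the Basic Multilingual Plane. If the table contains letters with two suffix chars, `MAX_LEN` would need to grow accordingly.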