I have built a KCL plus Spark application, following
https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
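I am running this on EMR (Spark installed through a bootstrap action). I created a stream named sparkTest and tested against it, and everything worked fine. I observed that no DynamoDB table was created.
I then deleted the stream and the cluster. The next day I created the Kinesis stream again with the same name and deployed my code on a newly launched cluster.
Now I am getting: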
5/06/12 08:17:28 ERROR worker.InitializeTask: Caught exception:
com.amazonaws.services.kinesis.model.InvalidArgumentException: StartingSequenceNumber 49551532098093284204238000035066183240246145871536717826 used in GetShardIterator on shard shardId-000000000000 in stream sparkTest under account 618673372431 is invalid because it did not come from this stream. (Service: AmazonKinesis; Status Code: 400; Error Code: InvalidArgumentException; Request ID: 770ef875-10db-11e5-b24b-af6f372168ae)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClie
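I don't understand why this happens. If I create a new Kinesis stream and work against that, it works again.
Is there a problem with Kinesis?
Another thread deals with a similar error: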
https://github.com/awslabs/amazon-kinesis-connectors/issues/8
But I am not setting a Kinesis application name anywhere, and I create the stream with:
KinesisUtils.createStream(
    jssc, streamName, endpointUrl, kinesisCheckpointInterval,
    InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
Best answer
// The Spark application name set here is also used as the Kinesis application name.
SparkConf sparkConfig = new SparkConf().setAppName("arbitraryName").setMaster("local[2]");
KinesisUtils.createStream(
    jssc, streamName, endpointUrl, kinesisCheckpointInterval,
    InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
If I change the name "arbitraryName" to something else, it works fine. I got this from
https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
key points:
The application name used in the streaming context becomes the Kinesis application name
The application name must be unique for a given account and region.
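One way to avoid reusing an application name after a stream has been deleted and recreated is to derive the name from something that changes with each new stream. The sketch below is only an illustration: the class `KinesisAppNames`, the method `uniqueAppName`, and the idea of appending the stream's creation time are assumptions, not part of Spark or the KCL. The likely reason a fresh name helps is that the KCL keeps its checkpoints in a DynamoDB table named after the application, so a reused name can pick up sequence numbers saved for the old stream. Deleting that stale table may therefore work as well.

```java
// Hypothetical helper (not part of Spark or the KCL): builds an application
// name that differs for each re-creation of a stream, so checkpoints saved
// for an earlier stream with the same name are not picked up again.
final class KinesisAppNames {

    private KinesisAppNames() {}

    // baseName: a logical application name, e.g. "arbitraryName"
    // streamName: the Kinesis stream name, e.g. "sparkTest"
    // streamCreatedEpochSeconds: when the current stream was created
    //   (assumed to be known to the caller, e.g. from deployment config)
    static String uniqueAppName(String baseName, String streamName,
                                long streamCreatedEpochSeconds) {
        String raw = baseName + "-" + streamName + "-" + streamCreatedEpochSeconds;
        // Restrict to letters, digits, '_', '-' and '.', which are the
        // characters DynamoDB allows in table names.
        return raw.replaceAll("[^A-Za-z0-9_.-]", "_");
    }
}
```

The result would then be passed to `new SparkConf().setAppName(...)` in place of the fixed string, for example `KinesisAppNames.uniqueAppName("arbitraryName", streamName, createdAt)`, where `createdAt` is a value you supply yourself.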
Regarding "emr - Kinesis GetShardIterator … is invalid because it did not come from this stream", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/30798709/