Problem description
We have an X-Files problem with our .NET application. Or rather, with our hybrid Win32 and .NET application.
When it attempts to communicate with Oracle, it just dies. Vanishes. Goes to the big black void in the sky. No event log message, no exception, no nothing.
If we simply ask the application to talk to an MS SQL Server instead, which has the effect of replacing the usage of OracleConnection and related classes with SqlConnection and related classes, it works as expected.
Today we had a breakthrough.
For some reason, a customer had figured out that by placing all the application files in a directory on his desktop, it worked as expected with Oracle as well. Moving the directory to the root of the drive, or into C:\Temp, or, well, around a bit, made the crash reappear.
Basically, it was 100% reproducible: the application worked if run from a directory on the desktop, and failed if run from a directory in the root.
Today we figured out that the difference that counted was whether there was a space in the directory name or not.
So, these directories would work:
C:\Program Files\AppDir\Executable.exe
C:\Temp Lemp\AppDir\Executable.exe
C:\Documents and Settings\someuser\Desktop\AppDir\Executable.exe
whereas these would not:
C:\CompanyName\AppDir\Executable.exe
C:\Programfiler\AppDir\Executable.exe <-- Program Files in Norwegian
C:\Temp\AppDir\Executable.exe
I'm hoping someone reading this has seen similar behavior and has an "aha, you need to twiddle the frob on the Oracle glitz driver configuration" or similar.
Anyone?
Followup #1: OK, I've processed the procmon output now, both files from when I hit the button that attempts to open the window that triggers the cascade failure. The two logs mostly track each other: there are some smallish differences near the top of both files, and then they stay in step a long way down.
However, at the point where one run fails, the other keeps going, and the next few lines of its log output are these:
ReadFile C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll SUCCESS Offset: 274 432, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O
ReadFile C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll SUCCESS Offset: 233 472, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O
After this, the working run continues to execute, whereas the failing run touches the mscorwks.dll files a few times before its threads close down and the app closes. Thus, the failed run does not touch the above files.
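For anyone repeating this kind of side-by-side comparison, here is a minimal Python sketch of finding the first point where two traces stop tracking each other. It assumes each trace has already been reduced to (operation, path, result) tuples; the function names, the sample events, and the placeholder path for mscorwks.dll are all hypothetical. Events are compared by operation and file name only, because the two runs start from different directories. Real traces are noisier than this, so a naive comparison like this one may report an early, uninteresting difference.

```python
import ntpath
from itertools import zip_longest


def event_key(event):
    """Reduce an event to (operation, lower-cased file name) for comparison."""
    operation, path, _result = event
    # ntpath handles Windows-style paths on any platform.
    return operation, ntpath.basename(path).lower()


def first_divergence(working, failing):
    """Return (index, working_event, failing_event) at the first mismatch, or None."""
    for index, (w, f) in enumerate(zip_longest(working, failing)):
        # One trace ending early also counts as a divergence.
        if w is None or f is None or event_key(w) != event_key(f):
            return index, w, f
    return None


# Hypothetical, heavily shortened traces.
working = [
    ("CreateFile", r"C:\Temp Lemp\AppDir\Executable.exe", "SUCCESS"),
    ("ReadFile", r"C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll", "SUCCESS"),
]
failing = [
    ("CreateFile", r"C:\Temp\AppDir\Executable.exe", "SUCCESS"),
    ("ReadFile", r"C:\placeholder\mscorwks.dll", "SUCCESS"),  # placeholder path
]

print(first_divergence(working, failing))
```

With the sample data, the first event matches despite the different directories, and the mismatch is reported at index 1.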
Followup #2: Figured I'd try to upgrade the Oracle client drivers, but 10.2.0.1 is apparently the highest version available for Windows 2003 server and XP clients.
Followup #3: Well, we've ended up with a black-box solution. Basically we found that the problem is somewhere related to XPO and Oracle. XPO has a system table it manages, called XPObjectType, with three columns: Oid, TypeName and AssemblyName. Due to how Oracle is configured in the databases we talk to, the column names were OID, TYPENAME and ASSEMBLYNAME. This would ordinarily not be a problem, except that XPO talks to the schema information directly and checks whether the table is there with the right column names. XPO doesn't handle case differences, so it sees an XPObjectType table with three unknown columns and none of those it expects.
Exactly what XPO does at that point I don't really know, but if I drop this table and recreate it with the right case, using double quotes around all the column names to preserve the case, the problem doesn't crop up.
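To illustrate why the double quotes matter, here is a small Python sketch. Oracle stores unquoted identifiers in upper case and keeps the case of quoted ones; the sketch models that rule and then runs a case-sensitive lookup for the three expected column names. How XPO actually performs its check is not known here, so missing_columns is a hypothetical stand-in for it, not XPO's code.

```python
# Column names XPO expects, as listed in the post.
EXPECTED = ("Oid", "TypeName", "AssemblyName")


def stored_name(identifier):
    """Name as Oracle stores it: quoted keeps its case, unquoted is upper-cased."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        return identifier[1:-1]
    return identifier.upper()


def missing_columns(ddl_columns, expected=EXPECTED):
    """Hypothetical case-sensitive check: which expected names are not found?"""
    actual = {stored_name(column) for column in ddl_columns}
    return [name for name in expected if name not in actual]


# Column names as they might be written in CREATE TABLE statements.
unquoted = ["Oid", "TypeName", "AssemblyName"]
quoted = ['"Oid"', '"TypeName"', '"AssemblyName"']

print(missing_columns(unquoted))  # ['Oid', 'TypeName', 'AssemblyName']
print(missing_columns(quoted))    # []
```

With unquoted names the table ends up with OID, TYPENAME and ASSEMBLYNAME, so a case-sensitive check finds none of the columns it expects. With quoted names all three are found.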
Exactly where the space in the folder name comes into this, I still have no idea, but this problem had two tiers:
- Stop the application from crashing at our customers, short-term solution
- Fix the bug, long-term solution
Right now tier 1 is solved; tier 2 will be put back into the queue for now and prioritized. We're facing some bigger changes to our data tier anyway, so this might not be a problem we need to solve, at least if all our Oracle customers verify that the table fix actually gets rid of the problem.
I'll accept the answer by Dave Markle: although Process Monitor (the big brother of File Monitor) didn't actually pinpoint the problem, I was able to use it to determine that after my breakpoint in user code, where XPO had built up the query for this table, no I/O happened until all the entries for the application closing down were logged. That led me to believe that this table was the culprit, or at least influenced the problem somehow.
If I manage to get to the real cause of this, I'll update the post.
Here's what I would do. First, TRIPLE-check that you're seeing the behavior you think you're seeing. I can see this happening the other way around by not using System.IO.Path to concatenate paths, but not like you're seeing it. Triple-check that the file permissions make sense.
Next, download Filemon from MS and watch what's happening on the filesystem as your program hits these troubled spots. You can filter out specific file activity (removing your anti-virus file activity, for example) to make everything look a bit cleaner while you do this. Look for file access errors using Filemon for both the success case and the error case for your program. That should point you to what file's being accessed and causing the problem. For example, if you see a FILE_NOT_FOUND error accessing a nonsense filename, you can be assured that you or the vendor are doing something wrong, possibly leading to your problem...
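If the log is long, the search for access errors can be scripted. The following is a minimal Python sketch that assumes the trace was saved as CSV with "Operation", "Path" and "Result" columns. Those column names, the exact spelling of the result values, and the sample rows (including the hypothetical.cfg file) are assumptions for illustration, so adjust them to whatever your tool actually exports.

```python
import csv
import io


def failed_accesses(csv_text, ok_results=("SUCCESS",)):
    """Return (operation, path, result) for every row whose result is not OK."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Operation"], row["Path"], row["Result"])
        for row in reader
        if row["Result"] not in ok_results
    ]


# Hypothetical export; raw strings keep the Windows backslashes literal.
SAMPLE = "\n".join([
    r'"Operation","Path","Result"',
    r'"ReadFile","C:\oracle\product\10.2.0\db_1\BIN\orageneric10.dll","SUCCESS"',
    r'"CreateFile","C:\Temp\AppDir\hypothetical.cfg","FILE_NOT_FOUND"',
])

for operation, path, result in failed_accesses(SAMPLE):
    print(result, operation, path)
```

Run it once on the log from the working case and once on the log from the failing case; errors that appear only in the failing run are the ones worth a closer look.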