
Exchange Server Error –1018: How Microsoft IT Recovers Damaged Exchange Databases


I found this paper on the Microsoft showcase site. I hope it is of help to you.

It sure helped me!

Technical White Paper

Published: August 1, 2005

Executive Summary

Error –1018 (JET_errReadVerifyFailure) is a familiar and dreaded error in Microsoft® Exchange Server. It indicates that an Exchange database file has been damaged by a failure or problem in the underlying file system or hardware.

This paper explains the conditions that result in error –1018. It also covers the detection mechanisms that Exchange uses to discover and recover from damage to its database files.

The Microsoft Information Technology group (Microsoft IT) runs one of the most extensive Exchange Server organizations in the world. Exchange administrators at Microsoft have investigated and recovered from dozens of –1018 error problems. This paper shows you how Microsoft IT monitors for this error, what happens after database file damage has been discovered, and how Microsoft recovers databases affected by the problem.

Note: For security reasons, the sample names of forests, domains, internal resources, organizations, and internally developed security file names used in this paper do not represent real resource names used within Microsoft and are for illustration purposes only.

Readers of this paper are assumed to be familiar with the basics of Exchange administration and database architecture. This paper describes Microsoft IT's experience and recommendations for dealing effectively with error –1018. It is not intended to serve as a procedural guide. Each enterprise environment has unique circumstances; therefore, each organization should adapt the material to its specific needs.

While the focus here is on Exchange Server 2003, nearly all the material covered applies to any version of Exchange. Exchange Server 2003 implements important new functionality for recovering from –1018 errors. This is discussed in "ECC Page Correction in Exchange Server 2003 SP1" later in this document.

Introduction

No computer data storage mechanism is perfect. Disks and tapes go bad. Glitches in hardware or bugs in firmware can cause data to be corrupted. The most basic strategy for dealing with this reality is redundancy: disks are mirrored or replicated, and data is backed up to remote locations so that when (not if) primary storage is compromised, data can be recovered from another copy.

Loss of data is not the only risk when data becomes corrupted. If corruption is undetected, bad decisions may be made based on the data. Stories are occasionally reported in the press about a decimal point that is removed by random corruption of a database record, and someone becomes a temporary millionaire as a result. Corruption of a database can cause errors that are even more subtle or difficult to diagnose. In Exchange, acting on a piece of corrupted metadata could cause mail destined for one user to be sent to another, or could cause all mail in a database to be lost.

Exchange databases therefore implement functionality to detect such damage. Even more important than detecting random corruption is not acting on it. After Exchange detects damage to its databases, the damaged area is treated as if it were completely unreadable. Thus, the database cannot be further harmed by relying on the damaged data.

The error code –1018 is reported when Exchange detects random corruption of its data by a problem in the underlying platform. Although data corruption is a serious problem, it is rare for a –1018 error detected during database run time to cause the database to stop or to seriously malfunction. This is because the majority of pages in an Exchange database have user message data written on them. The loss of a single random page in the database is most likely to result in lost messages. One user or group of users may be affected, but there is no impact to the overall structural and logical integrity of the database. After a –1018 problem has been detected, Exchange will keep running as long as the lost data is not critical to the integrity of the database as a whole.

A –1018 error may be reported repeatedly for the same location in the database. This can happen if a user tries repeatedly to access a particular damaged message. Each time, the access will fail and a new error will be logged.

Because the immediate loss of data associated with error –1018 may be minimal, you may be tempted to ignore the error. That would be a dangerous mistake. A –1018 error must be investigated thoroughly and promptly. Error –1018 indicates the possibility of other imminent failures in the platform.

Understanding Error –1018

Error code –1018 (JET_errReadVerifyFailure) means that one of two conditions has been detected when reading a page in the database:

  • The logical page number recorded on the page does not correspond to the physical location of the page inside the database file.

  • The checksum recorded on the page does not match the checksum Exchange expects to find on the page.

Statistically, a –1018 error is much more likely to be related to a wrong checksum than to a wrong page number.

To understand why these conditions indicate file-level damage to the database, you need to know a little more about how Exchange database files are organized.

Page Ordering

Each Exchange Server 2003 database consists of two matched files: the .edb file and the .stm file. These files must be copied, moved, or backed up together and must remain synchronized with each other.

Inside the database files, data is organized in sequential 4-kilobyte (4,096-byte) pages. Several pages can be collected together to form logical structures called balanced trees (B+-Trees). Several of these trees are linked together to form database tables. There may be thousands of tables in a database, depending on how many mailboxes or folders it hosts.

Each page is owned by a single B+-Tree, and each B+-Tree is owned by a single table. Error –1018 reports damage at the level of individual pages. Because database tables are made up of pages, the error also implies problems at the higher logical levels of the database.

At the beginning of each database file are two header pages. The header pages record important information about the database. You can view the information on the header pages with Eseutil, the Exchange Server Database Utilities tool.

After the header pages, every other page in a database file is either a data page or an empty page waiting for data. Each data page is numbered, in sequential order, starting at 1. Because of the two header pages at the beginning of the file, the third physical page is the first logical data page in the database. (You can consider the two header pages to be logical pages –1 and 0.)

Note: Each database file as a whole has a header, and each page in a database also has its own header. It can be confusing to distinguish between the two.

The database header is at the beginning of the database file, and it records information about the database as a whole. A page header is the first 40 bytes of each and every page, and it records important information only about that particular page. Just as Eseutil can display database header information, it can also display page header fields.

In an Exchange database, you can easily calculate which logical page you are on for any physical byte offset into the database file. Logical page –1, which is the first copy of the database header, starts at offset 0. Logical page 0, a second copy of the database header, starts at offset 4,096. Logical page 1, the first data page in the database, starts at offset 8,192. Logical page 2 starts at offset 12,288, and so on.

Each –1018 error is for a single page in the database, and it can be useful in advanced troubleshooting to be able to locate the exact page where the error occurred.

As general formulas:

  • (Logical page number + 1) × 4,096 = byte offset

  • (byte offset ÷ 4,096) – 1 = logical page number

These examples may be useful:

Suppose you need to know the exact byte offset for logical page 101 in a database. Using the first formula, (101 + 1) × 4,096 = 417,792, so logical page 101 starts exactly 417,792 bytes into the file.

Now, suppose you need to know what page is at byte offset 4,104,192. Using the second formula, (4,104,192 ÷ 4,096) – 1 = 1,001, so logical page 1,001 starts 4,104,192 bytes into the file.

In most cases, a Windows Application Log event reporting error –1018 will list the location of the bad page as a byte offset. Therefore, the second formula is likely to be the most frequently used. In any case, the two formulas allow you to translate back and forth between logical pages and byte offsets as needed.
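
As an illustration only, the two formulas can be written as a pair of small helper functions. This is a minimal Python sketch, not part of any Exchange tool. The names PAGE_SIZE, page_to_offset, and offset_to_page are hypothetical, and rounding down for offsets that fall in the middle of a page is an assumption.

```python
PAGE_SIZE = 4096  # bytes per database page, as stated in the paper


def page_to_offset(logical_page: int) -> int:
    """Return the byte offset at which a logical page starts."""
    return (logical_page + 1) * PAGE_SIZE


def offset_to_page(byte_offset: int) -> int:
    """Return the logical page that contains the given byte offset.

    Floor division is an assumption made here so that an offset in the
    middle of a page still maps to that page.
    """
    return byte_offset // PAGE_SIZE - 1
```

With these helpers, page_to_offset(101) gives 417,792 and offset_to_page(4104192) gives 1,001, matching the two worked examples.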

The logical page number is actually recorded on each page in the database. (In Exchange Server 2003 with Service Pack 1 (SP1), the method for doing this has changed. For more details, see "ECC Page Correction in Exchange Server 2003 SP1" later in this document.) When Exchange reads a page, it checks whether the logical page number matches the byte offset. If it does not match, a –1018 error results, and the page is treated as unreadable.

The correspondence between physical and logical pages is important because it allows Exchange to detect whether its pages have been stored in correct order in the database files. If the physical location does not match the logical page number, the page was written to the wrong place in the file system. Even if the data on the page is correct, if the page is in the wrong place, Exchange will detect the problem and not use the page.
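
The page-ordering check can be pictured with a short Python sketch. It is illustrative only: the position of the page number field (a 4-byte little-endian value at bytes 4 to 7 of the page) is an assumption made for this example, and all function names are hypothetical. It does not model the SP1 header format.

```python
import struct

PAGE_SIZE = 4096          # bytes per page, as described in the paper
PAGE_NUMBER_OFFSET = 4    # hypothetical position of the page number field


def recorded_page_number(page: bytes) -> int:
    """Read the logical page number from the (assumed) header field."""
    return struct.unpack_from("<I", page, PAGE_NUMBER_OFFSET)[0]


def expected_page_number(byte_offset: int) -> int:
    """Derive the logical page number from the physical byte offset."""
    return byte_offset // PAGE_SIZE - 1


def page_is_in_place(page: bytes, byte_offset: int) -> bool:
    """True when the recorded page number matches the physical location."""
    return recorded_page_number(page) == expected_page_number(byte_offset)


def make_page(logical_page: int) -> bytes:
    """Build a blank demo page that records the given logical page number."""
    page = bytearray(PAGE_SIZE)
    struct.pack_into("<I", page, PAGE_NUMBER_OFFSET, logical_page)
    return bytes(page)
```

A page built with make_page(5) passes the check at byte offset 24,576 and fails it at any other page boundary, which is the situation the paper describes as a page written to the wrong place.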

Page Checksum

Along with the logical page number, each page in the database also stores a calculated checksum for its data. The checksum is at the beginning of the page and is derived by running an algorithm against the data on the page. This algorithm returns a 4-byte checksum number. If something on a page changes, the checksum on the page will no longer match the data on the page. (In Exchange Server 2003 SP1, the checksum algorithm has become more complicated than this, as you will learn in the next section.)

Every time Exchange reads a page in the database, it runs the checksum algorithm again and makes sure the result is the same as the checksum already on the page. If it is not, something has changed on the page. A –1018 error is logged, and the page is treated as unreadable.
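
The verify-on-read idea can be sketched in a few lines of Python. This is not Exchange's checksum algorithm, which the paper does not specify. CRC-32 from the standard zlib module is used here purely as a stand-in that also produces a 4-byte value, and the function names are hypothetical.

```python
import struct
import zlib

PAGE_SIZE = 4096
CHECKSUM_SIZE = 4  # the paper states that the checksum is 4 bytes


def seal_page(body: bytes) -> bytes:
    """Prefix a page body with a checksum of that body (CRC-32 stand-in)."""
    if len(body) != PAGE_SIZE - CHECKSUM_SIZE:
        raise ValueError("body must fill the page after the checksum field")
    return struct.pack("<I", zlib.crc32(body)) + body


def verify_page(page: bytes) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == zlib.crc32(page[CHECKSUM_SIZE:])
```

A page produced by seal_page verifies successfully; changing any single bit of its body makes verify_page return False, which corresponds to the condition reported as error –1018.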

ECC Page Correction in Exchange Server 2003 SP1

Exchange Server 2003 SP1 includes an important new recovery mechanism for some –1018 related damage. This mechanism is an Error Correction Code (ECC) checksum that is placed on each page. This checksum is in addition to the checksum present in previous versions of Exchange.

Each Exchange page now has two checksums, one right after the other, at the beginning of each page. The first checksum (the data integrity checksum) determines whether the page has been damaged; the second checksum (the ECC checksum) can be used to automatically correct some kinds of random corruption. Before Exchange Server 2003 SP1, Exchange could reliably detect damage, but could not do anything about it.

By surveying many –1018 cases, Microsoft discovered that approximately 40 percent of –1018 errors are caused by a bit flip. A bit flip occurs when a single bit on a page has the wrong value: a bit that should be 1 flips to 0, or vice versa. This is a common error with computer disks and memory.

The ECC checksum can correct a bit flip. This means that approximately 40 percent of –1018 errors are self-correcting if you are using Exchange Server 2003 SP1 or later.

Note: ECC checksums that can detect multiple bit flips are possible, but not practical to implement. Single-bit error correction has minimal performance overhead, but it would be costly in terms of performance to detect and correct multiple bit errors. As a statistical matter, the distribution of page errors tends to cluster in two extremes: single bit errors and massive damage to the page.
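
To make single-bit correction concrete, here is a minimal Python sketch of one simple scheme. It is not the ECC algorithm that Exchange uses, which the paper does not describe. The sketch XORs together the 1-based positions of all set bits to form a locator value, and uses CRC-32 as a stand-in detection checksum. It also assumes that the two stored values themselves are undamaged. All names are hypothetical.

```python
import zlib


def bit_locator(data: bytes) -> int:
    """XOR together the 1-based positions of every set bit in data."""
    locator = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                locator ^= byte_index * 8 + bit + 1
    return locator


def try_correct(data: bytes, stored_crc: int, stored_locator: int):
    """Return repaired data, or None if the damage is not one flipped bit."""
    if zlib.crc32(data) == stored_crc:
        return data  # detection checksum matches, nothing to repair
    # A single flipped bit changes the locator by exactly its own position.
    position = stored_locator ^ bit_locator(data)
    if not 1 <= position <= len(data) * 8:
        return None
    repaired = bytearray(data)
    byte_index, bit = divmod(position - 1, 8)
    repaired[byte_index] ^= 1 << bit
    # Accept the repair only if the detection checksum now matches.
    if zlib.crc32(bytes(repaired)) != stored_crc:
        return None
    return bytes(repaired)
```

The second check at the end reflects the split the paper describes: one value decides whether the page is damaged, and the other is used only to attempt a repair. Damage to more than one bit leaves the detection checksum mismatched after the attempted repair, so the page is still reported as bad.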

If a –1018 error is corrected by the ECC mechanism, it does not mean you can safely ignore the error. ECC correction does not change the fact that the underlying platform did not reliably store or retrieve data. ECC makes recovery from error –1018 automatic (about 40 percent of the time), but does not change anything else about the way you should respond to a –1018 error.

The format of Exchange database page headers had to be changed to accommodate the ECC checksum. The field in each page header that used to carry the logical page number now carries the page number mixed with the ECC checksum. This means that Exchange Server 2003 SP1 databases are not backward compatible, even with the Exchange Server 2003 original release. The same applies to database tools, such as Eseutil. With older versions of the tools, the ECC databases appear to be massively corrupt, because the older tools do not take the ECC checksum into account.

For more information about ECC page correction, refer to the Microsoft Knowledge Base article "New error correcting code is included in Exchange Server 2003 SP1" [http://support.microsoft.com/kb/867626].

Backup and Error –1018

A –1018 error may be encountered at any time while the database is running. However, this is not how the majority of –1018 problems are actually discovered. Instead, they are more often found during backup.

A –1018 error is reported only when a page is read, and not all pages in the database are likely to be read frequently. For example, messages in a user's Deleted Items folder may not be accessed for long periods. A –1018 error in such a location could go undetected for a long time. To detect –1018 problems quickly, you must read all the pages in the databases. Online backup is a natural opportunity for checking the entire database for –1018 damage, because to back up the whole database you have to read the whole database.
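
The point that finding damage requires reading every page can be sketched as a simple scan loop. The Python below is illustrative only. It treats the file as two header pages followed by 4,096-byte data pages, uses CRC-32 as a stand-in for the real page checksum, and skips the header pages for simplicity. The function names are hypothetical.

```python
import io
import struct
import zlib

PAGE_SIZE = 4096
HEADER_PAGES = 2  # two header pages precede the data pages


def seal_page(body: bytes) -> bytes:
    """Prefix a page body with its checksum (CRC-32 stand-in)."""
    return struct.pack("<I", zlib.crc32(body)) + body


def page_ok(page: bytes) -> bool:
    """True when the page is complete and its stored checksum matches."""
    if len(page) != PAGE_SIZE:
        return False
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == zlib.crc32(page[4:])


def find_damaged_pages(stream) -> list:
    """Read every data page in order; return the logical numbers that fail."""
    damaged = []
    stream.seek(HEADER_PAGES * PAGE_SIZE)
    logical_page = 1
    while True:
        page = stream.read(PAGE_SIZE)
        if not page:
            break
        if not page_ok(page):
            damaged.append(logical_page)
        logical_page += 1
    return damaged


def scan_image(image: bytes) -> list:
    """Convenience wrapper for scanning an in-memory copy of a file."""
    return find_damaged_pages(io.BytesIO(image))
```

A page that is rarely read by users is still visited by this loop, which is why a full pass such as an online backup tends to surface damage that normal activity misses.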

Exchange Online Streaming API Backups

Exchange has always supported an online streaming backup application programming interface (API) that allows Exchange databases to be backed up while they are running. Many third-party vendors have created Exchange-aware backup modules or agents that use this API. Backup, the backup program that comes with Microsoft Windows Server™ 2003 or Windows® 2000 Server, supports the Exchange streaming backup API. If you install Exchange Server or Exchange administrator programs on a computer, Backup is automatically enabled for Exchange-aware online backups.

If a –1018 page is encountered during online backup, the backup will be stopped. Exchange will not allow you to complete an online backup of a database with –1018 damage. This is to ensure that your backup can never have a –1018 problem in it. This is important because it means you can recover from a –1018 problem by restoring from your last backup and bringing the database up-to-date with the subsequent transaction log files. After you do this, you will have a database that is up-to-date, with no data loss, and with no –1018 pages.

Playing transaction logs will never introduce a –1018 error into a database. However, playing transaction logs may uncover an already existing –1018 error. To apply transaction log data, Exchange must read each destination page in the database. If a destination page is damaged, transaction log replay will fail. Exchange cannot simply replace a page with what is in the transaction log, because a transaction log update may cover only part of a page.
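
Why replay has to read the destination page first can be shown with a small sketch. The Python below is a simplified model, not the Exchange log format: a log record is assumed to carry a page number, an offset within the page body, and the bytes to write there. CRC-32 stands in for the page checksum, and all names are hypothetical.

```python
import struct
import zlib

PAGE_SIZE = 4096
CHECKSUM_SIZE = 4
JET_ERR_READ_VERIFY_FAILURE = -1018


class ReadVerifyFailure(Exception):
    """Raised when a destination page fails verification during replay."""


def seal_page(body: bytes) -> bytes:
    """Prefix a page body with its checksum (CRC-32 stand-in)."""
    return struct.pack("<I", zlib.crc32(body)) + body


def verify_page(page: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    (stored,) = struct.unpack_from("<I", page, 0)
    return stored == zlib.crc32(page[CHECKSUM_SIZE:])


def replay_record(pages: dict, page_number: int,
                  body_offset: int, new_bytes: bytes) -> None:
    """Apply one partial-page update to the in-memory page store."""
    page = pages[page_number]
    # The record changes only part of the page, so the rest of the page
    # must be read from the database and must be trustworthy.
    if not verify_page(page):
        raise ReadVerifyFailure(
            f"page {page_number}: error {JET_ERR_READ_VERIFY_FAILURE}")
    body = bytearray(page[CHECKSUM_SIZE:])
    if body_offset < 0 or body_offset + len(new_bytes) > len(body):
        raise ValueError("update does not fit inside the page body")
    body[body_offset:body_offset + len(new_bytes)] = new_bytes
    pages[page_number] = seal_page(bytes(body))
```

In this model, an update to an intact page succeeds and the page is resealed, while the same update against a damaged page raises ReadVerifyFailure. That mirrors the paper's point that replay uncovers existing damage rather than introducing it.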

If you restore from an online backup and encounter a –1018 error during transaction log replay, the most likely reason is that corruption was introduced into the database by hardware instability during or after restoration. To test this, restore the same backup to known good hardware. For more information, see "Can Exchange Cause a –1018 Error?" later in this document.

Restoring from an online backup and replaying subsequent transaction logs is the standard strategy for recovering from –1018 errors. Other strategies for special circumstances are outlined in "Recovering from a –1018 Error" later in this document.

Backup Retries and Transient –1018 Errors

Not all –1018 errors are permanent. A –1018 error may be reported because of a failure in memory or in a subsystem other than the disk. In that case, the database page on the disk is good, but the system does not read the disk reliably. To handle such cases, and to give the backup a better chance to succeed even on failing hardware, Exchange has functionality to retry –1018 errors encountered during backup.

If a –1018 error is reported when a page is backed up, Exchange will wait a second or two, and then try again to read the page. This will happen up to 16 times before Exchange gives up, fails the read of the page, and then fails the backup.

If Exchange eventually reads the page successfully, the copy of the page on the disk is good, but there is a serious problem elsewhere in the system. وإذا كان تبادل يقرأ صفحة في نهاية المطاف بنجاح ، نسخة من الصفحة على القرص جيدة ، ولكن هناك مشكلة خطيرة في أماكن أخرى في النظام. Even if Exchange is not successful in reading the page, it does not prove conclusively that the page is bad. وحتى لو تبادل هو لم تنجح في قراءة الصفحات ، أنه لا يثبت بشكل قاطع أن صفحة سيئة. Depending on how hardware caching has been implemented, all 16 read attempts may come from the same cache rather than directly from the disk. واعتمادا على كيفية معدات التخزين المؤقت قد نفذت ، 16 قراءة جميع المحاولات قد يأتي من نفس مخبأ بدلا من مباشرة من القرص. Exchange waits between each read attempt and tries to read again directly from the disk to increase the likelihood that the read will not be satisfied from cache. تبادل تنتظر بين كل قراءة ومحاولة يحاول مرة أخرى قراءة مباشرة من القرص لزيادة احتمال أن يقرأ لن ترضى عن مخبأ.
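The retry behavior can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Exchange's implementation: `read_page` and `verify` are hypothetical callables supplied by the caller, and the one-second delay is an assumption standing in for "a second or two".

```python
import time

MAX_ATTEMPTS = 16  # number of read attempts stated in the text


def read_with_retries(read_page, verify, delay_seconds=1.0, sleep=time.sleep):
    """Return (page, attempts) on success; raise IOError after 16 failed attempts.

    read_page: hypothetical callable that returns the raw page bytes.
    verify:    hypothetical callable that returns True if the checksum matches.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        page = read_page()
        if verify(page):
            return page, attempt
        if attempt < MAX_ATTEMPTS:
            sleep(delay_seconds)  # pause before the next read, as the text describes
    raise IOError("page failed verification on all %d attempts" % MAX_ATTEMPTS)
```

A success on a later attempt means the page on disk is good but some other component is unreliable, so the attempt count is worth logging even when the backup succeeds.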

Exchange Volume Shadow Copy Service API Online Backups

If you are running Exchange Server 2003 on Windows Server 2003, you have the additional online backup option of performing Volume Shadow Copy service-based online backups of Exchange. The Volume Shadow Copy service online backup API is a new method that is similar in its capabilities to the streaming backup API, but that can allow for faster restoration times independent of the database file size. How fast Volume Shadow Copy service backup is compared to streaming backup depends on a number of factors, the most important of which is whether the Volume Shadow Copy provider is software-based or hardware-based. Both software-based and hardware-based providers can make snapshot and clone copies of files even when the files are locked open and in use. However, if you use a software provider, the process is no faster than when making an ordinary file copy. To make the snapshot or clone process almost instantaneous, even for very large files, you must use a hardware provider.

Backup for Windows 2003 includes a software-based generic Volume Shadow Copy service provider, but does not support Exchange-aware Volume Shadow Copy service backups. If you are using any version of Backup for Windows as your Exchange backup application, you must perform streaming API online backups.

An Exchange-aware Volume Shadow Copy service backup must complete in less than 20 seconds. This is because Exchange suspends changes to the database files during the backup. If the snapshot or clone does not complete within 20 seconds, the backup fails. Thus, a hardware provider is required because the backup must complete so quickly.

Exchange has no opportunity to read database pages during a Volume Shadow Copy service backup. Therefore, the database cannot be checked for –1018 problems during backup. If you use a Volume Shadow Copy service-based Exchange backup solution, the vendor must verify the integrity of the backup in a separate operation soon after the backup has finished.

For more information about Volume Shadow Copy service backup and Exchange, see the Microsoft Knowledge Base article "Exchange Server 2003 data backup and Volume Shadow Copy services" [ http://support.microsoft.com/kb/822896 ] .

Application Log Event IDs

When a –1018 error occurs, you will not see a –1018 event in the application log. Instead, there are several different events that will report the –1018 as part of their Description fields. Which event is logged depends on the circumstances under which the –1018 problem was detected.

This listing of events associated with error –1018 is not comprehensive, but it does include the core events for which you should monitor.

For all versions of Exchange, Microsoft Operations Manager (MOM) monitors for events 474, 475, and 476 from the event source Extensible Storage Engine (ESE). If you are running Exchange Server 2003 SP1, you should also ensure that event 399 is monitored.
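As a rough illustration of this monitoring rule, the following Python sketch filters a list of event records down to the ones named above. The record shape (a dict with `source` and `event_id` keys) is a hypothetical stand-in chosen for the example; it is not a MOM or Windows event log structure.

```python
# Event IDs the text says to monitor; 399 applies to Exchange Server 2003 SP1.
MONITORED_IDS = {474, 475, 476, 399}


def select_ese_alerts(events):
    """Return the events from source "ESE" whose ID is in MONITORED_IDS.

    events: iterable of dicts with "source" and "event_id" keys
    (a hypothetical record shape used only for this sketch).
    """
    return [e for e in events
            if e.get("source") == "ESE" and e.get("event_id") in MONITORED_IDS]
```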

Event 474

For versions of Exchange prior to Exchange Server 2003 SP1, event 474 is logged when any checksum discrepancy is detected. For Exchange Server 2003 SP1, this event is logged only when multiple bit errors exist on a page. If a single bit error is detected, event 399 (discussed later in this document) is logged instead.

Here is an example of a typical event 474:

Event Type: Error
Event Source: ESE
Event ID: 474

Description: Information Store (3500) First Storage Group: The database page read from the file "C:\mdbdata\priv1.edb" at offset 2121728 (0x0000000000206000) for 4096 (0x00001000) bytes failed verification due to a page checksum mismatch. The expected checksum was 1848886333 (0x6e33c43d) and the actual checksum was 1848886845 (0x6e33c63d). The read operation will fail with error –1018 (0xfffffc06). If this condition persists then please restore the database from a previous backup. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.

The Description field of this event provides information that can be useful for advanced troubleshooting and analysis. You should always preserve this information after a –1018 error has been reported. Providing this information to hardware vendors or to Microsoft Product Support Services may be helpful when troubleshooting multiple –1018 errors.
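One way to preserve this information in a structured form is to pull the fields out of the Description text. The Python sketch below is written against the event 474 wording quoted in this document only; other Exchange versions or locales may word the message differently, so treat the pattern as an assumption.

```python
import re

# Pattern matched against the event 474 wording quoted in this document.
EVENT_474 = re.compile(
    r'read from the file "(?P<file>[^"]+)" at offset (?P<offset>\d+) '
    r'\(0x[0-9a-fA-F]+\) for (?P<length>\d+) \(0x[0-9a-fA-F]+\) bytes '
    r'failed verification due to a page checksum mismatch\.\s+'
    r'The expected checksum was \d+ \((?P<expected>0x[0-9a-fA-F]+)\) '
    r'and the actual checksum was \d+ \((?P<actual>0x[0-9a-fA-F]+)\)'
)


def parse_event_474(description):
    """Extract file, offset, length, and checksums; return None if no match."""
    m = EVENT_474.search(description)
    if m is None:
        return None
    return {
        "file": m.group("file"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "expected": int(m.group("expected"), 16),
        "actual": int(m.group("actual"), 16),
    }
```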

The Description field shows which database has been damaged and where the damage occurred. To translate a byte offset to a logical page number, recall the formula described in "Page Ordering" earlier in this document. Using that formula, you know that the page damaged in this error is logical page 517 because (2121728 ÷ 4096) – 1 = 517. Direct analysis of the page may show patterns that will help a hardware vendor determine the problem that caused the damage.
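The conversion is simple enough to script. The following Python sketch applies the formula exactly as stated above, assuming the 4096-byte page size shown in the event description; the function name is made up for the example.

```python
PAGE_SIZE = 4096  # bytes per page, as reported in the event description


def logical_page_number(byte_offset, page_size=PAGE_SIZE):
    """Convert a byte offset in the database file to a logical page number."""
    if byte_offset % page_size != 0:
        raise ValueError("offset is not page-aligned")
    return byte_offset // page_size - 1


print(logical_page_number(2121728))  # offset from the event 474 example
```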

The description also lists the checksum that is written on the page as the expected checksum: 6e33c43d. The actual checksum is the checksum that Exchange calculates again as it reads the page: 6e33c63d.

Why does it help to know what the checksum values are? Patterns in the checksum differences may assist in advanced troubleshooting. For an example of this, see "Appendix A: Case Studies" later in this document.

In addition, you can tell whether a particular –1018 error is the result of a single bit error (bit flip) by comparing the expected and actual checksums. To do this, translate the checksums to their binary numbering equivalents. If the checksums are identical except for a single bit, the error on the page was caused by a bit flip.

The checksums listed in the preceding example can be translated to their binary equivalents using Calc.exe in its scientific mode:

0x6e33c43d = 1101110001100111100010000111101
0x6e33c63d = 1101110001100111100011000111101
            Single bit difference ^

In the preceding example, if this error had occurred on an Exchange Server 2003 SP1 database, the error would have been automatically corrected.
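The binary comparison can also be done without Calc.exe. The Python sketch below XORs the two checksums and lists the bit positions that differ; the function names are illustrative only.

```python
def differing_bits(expected, actual):
    """Return the bit positions (0 = least significant) where two values differ."""
    diff = expected ^ actual
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def is_single_bit_flip(expected, actual):
    """True when the two checksums differ in exactly one bit."""
    return len(differing_bits(expected, actual)) == 1


# Checksums from the event 474 example above.
print(differing_bits(0x6e33c43d, 0x6e33c63d))
print(is_single_bit_flip(0x6e33c43d, 0x6e33c63d))
```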

In Exchange Server 2003 SP1, the checksum reported in the Description field of event 474 shows the page integrity checksum and the ECC checksum together. For example:

Description: Information Store (3000) SG1018: The database page read from the file "D:\exchsrvr\SG1018\priv1.edb" at offset 2371584 (0x0000000000243000) for 4096 (0x00001000) bytes failed verification due to a page checksum mismatch. The expected checksum was 2484937984258 (0x0000024291d88902) and the actual checksum was 62488400759392765 (0x00de00de91d889fd). The read operation will fail with error –1018 (0xfffffc06). If this condition persists then please restore the database from a previous backup. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.

Notice that the checksum listed here is 16 hexadecimal characters, whereas in the previous example the checksum is eight hexadecimal characters. In the new checksum format, the first eight characters are the ECC checksum, and the last eight characters are the page integrity checksum.
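Splitting the 16-character value into its two halves is a matter of shifting and masking. The Python sketch below follows the layout described in the text (first eight hexadecimal characters are the ECC checksum, last eight are the page integrity checksum); the function name is made up for the example.

```python
def split_sp1_checksum(value):
    """Split a reported 64-bit checksum into (ecc, page_integrity) halves."""
    ecc = (value >> 32) & 0xFFFFFFFF       # first eight hex characters
    page_integrity = value & 0xFFFFFFFF    # last eight hex characters
    return ecc, page_integrity


# Expected and actual values from the SP1 example above.
for reported in (0x0000024291d88902, 0x00de00de91d889fd):
    ecc, integrity = split_sp1_checksum(reported)
    print("ECC 0x%08x, page integrity 0x%08x" % (ecc, integrity))
```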

Event 475

Event 475 indicates a –1018 problem caused by a wrong page number. It is no longer used in Exchange Server 2003. Instead, bad checksums and wrong page numbers are reported together under Event 474. The following is an example of event 475:

Event Type: Error

Event Source: ESE

Event ID: 475

Description: Information Store (1448) The database page read from the file "C:\MDBDATA\priv1.edb" at offset 1257906176 (0x000000004afa2000) for 4096 (0x00001000) bytes failed verification due to a page number mismatch. The expected page number was 307105 (0x0004afa1) and the actual page number was 307169 (0x0004afe1). The read operation will fail with error –1018 (0xfffffc06). If this condition persists then please restore the database from a previous backup.

Event 475 can be misleading. It may not mean the page is in the wrong location in the database. It only indicates that the page number field is wrong. Only if the checksum on the page is also valid can you conclude that the page is in the wrong location. Advanced analysis of the actual page is required to determine whether the field is corrupted or the page is in the wrong place. In the majority of cases, the page field is corrupted.

Notice that in the preceding example, the difference in the page number fields is a single bit, indicating that this page is probably in the right place, but was damaged by a bit flip.
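The single-bit observation for the event 475 example can be checked with a short Python sketch. It uses the hexadecimal page numbers and the byte offset from the description, and it also confirms that the expected page number agrees with the offset. The function name is illustrative.

```python
def page_number_bits_changed(expected_page, actual_page):
    """Count the bits that differ between two page number fields."""
    return bin(expected_page ^ actual_page).count("1")


expected_page = 0x0004AFA1   # expected page number from the event text
actual_page = 0x0004AFE1     # page number actually found on the page
offset = 1257906176          # byte offset from the event text

# The expected page number should follow from the offset: (offset / 4096) - 1.
expected_from_offset = offset // 4096 - 1
changed = page_number_bits_changed(expected_page, actual_page)
print(expected_from_offset == expected_page, changed)
```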

Event 476

Event 476 indicates error –1019 (JET_errPageNotInitialized). This error occurs if a page in the database is expected to be in use, but the page number is zero.

In releases of Exchange prior to Exchange 2003 Service Pack 1, the first four bytes of each page store the checksum, and the next four bytes store the page number. If the page number field is all zeroes, then the page is considered uninitialized. To make room for the ECC checksum in Exchange 2003 Service Pack 1, the page number field has been converted to the ECC checksum field. The page number is now calculated as part of the checksum data, and a page is now considered to be uninitialized if both the original checksum and ECC checksum fields are zeroed.
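The two uninitialized-page rules can be expressed as a small Python sketch. It reads only the first eight bytes of a page, as described above. Little-endian byte order is an assumption made for the example, and the function names are hypothetical; this is not the ESE page-parsing code.

```python
import struct


def read_header_fields(page):
    """Return the first two 32-bit fields of a page as unsigned integers.

    Little-endian byte order is an assumption made for this sketch.
    """
    return struct.unpack_from("<II", page, 0)


def is_uninitialized(page, sp1_format):
    """Apply the uninitialized-page rule described in the text."""
    first, second = read_header_fields(page)
    if sp1_format:
        # SP1 and later: checksum and ECC checksum fields must both be zero.
        return first == 0 and second == 0
    # Before SP1: the second field is the page number; zero means uninitialized.
    return second == 0
```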

Event Type: Error

Event Source: ESE

Event ID: 476

Description: Information Store (3500) First Storage Group: The database page read from the file "C:\mdbdata\priv1.edb" at offset 2121728 (0x0000000000206000) for 4096 (0x00001000) bytes failed verification because it contains no page data. The read operation will fail with error –1019 (0xfffffc05). If this condition persists then please restore the database from a previous backup. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.

In most cases, error –1019 is just a special case of error –1018. However, it could also be that a logical problem in the database has caused a table to show that an empty page is in use. Because you cannot distinguish between these two cases without advanced logical analysis of the entire database, error –1019 is reported instead of error –1018.

Error –1019 is rare, and a full discussion of analyzing and troubleshooting this error is outside the scope of this paper.
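The hexadecimal values shown in parentheses in these event descriptions are the 32-bit two's complement form of the negative error numbers. A short Python sketch makes the conversion explicit; the function name is made up for the example.

```python
def to_signed32(value):
    """Interpret a 32-bit unsigned value as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(to_signed32(0xfffffc06))  # code shown in the event 474 and 475 examples
print(to_signed32(0xfffffc05))  # code shown in the event 476 example
```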

Event 399

Event 399 is a new event that was added in Exchange Server 2003 SP1. It is a Warning event, and not an Error event. It indicates that a single bit corruption has been detected and corrected in the database.

Event Type: Warning

Event Source: ESE

Event ID: 399

Description: Information Store (3000) First Storage Group: The database page read from the file "C:\mdbdata\priv1.edb" at offset 4980736 (0x00000000004c0000) for 4096 (0x00001000) bytes failed verification. Bit 144 was corrupted and has been corrected. This problem is likely due to faulty hardware and may continue. Transient failures such as these can be a precursor to a catastrophic failure in the storage subsystem containing this file. Please contact your hardware vendor for further assistance diagnosing the problem.

Although Event 399 is a warning rather than an error, it should be monitored for and treated as seriously as any uncorrectable –1018 error. All –1018 errors indicate platform instability of one degree or another and may indicate additional errors will occur in the future.

Event 217

Event 217 indicates backup failure because of a –1018 error.

Event Type: Error

Event Source: ESE

Event ID: 217

Description: Information Store (1224) First Storage Group: Error (–1018) during backup of a database (file C:\mdbdata\priv1.edb). The database will be unable to restore.

Immediately before this error occurs, you will typically find a series of 16 event 474 errors in the application log, all for the same page. During backup, Exchange will retry a page read 16 times, waiting a second or two between each attempt. This is done in case the error is transient, so that a backup has a better chance to succeed.

Retries are not done for normal run-time read errors, but only during backup. Performing retries during normal operation could stall the database if a frequently accessed page is involved.

Event 221

Event 221 indicates backup success. It is generated for each database file individually when it is backed up.

Event Type: Information

Event Source: ESE

Event ID: 221

Description: Information Store (1224) First Storage Group: Ending the backup of the file C:\mdbdata\priv1.edb.

----------

Event Type: Information

Event Source: ESE

Event ID: 221

Description: Information Store (1224) First Storage Group: Ending the backup of the file D:\mdbdata\priv1.stm.

If you are using third-party backup applications, there may be additional backup events that you should monitor in addition to those listed here.

Root Causes

At the simplest level, there are only three root causes for –1018 errors:

  • The underlying platform for your Exchange database has failed to reliably write Exchange data to storage.

  • The underlying platform for your Exchange database has failed to reliably read Exchange data from storage.

  • The underlying platform for your Exchange database has failed to reliably preserve Exchange data while in storage.

This level of analysis defines the scope of the issue. At a practical level, you want to know:

  • Will this happen again?

  • What should I do to recover from the error?

How Microsoft IT assesses the likelihood of additional errors is described in "Server Assessment and Root Cause Analysis" later in this document. Recovery strategies are also described later in this document. This section summarizes the most common root causes for error –1018:

  • Failing disk drives. Along with simple drive failures, it is not uncommon for Microsoft Product Support Services to handle cases where rebuilding a redundant array of independent disks (RAID) drive set after a drive failure is not successful.

  • Hard failures. Sudden interruption of power to the server or the disk subsystem may result in corruption or loss of recently changed data. Enterprise class server and storage systems should be able to handle sudden loss of power without corruption of data. Microsoft has tested Exchange and Exchange servers by unplugging a test server thousands of times in succession, with no corruption of Exchange data afterward.

    Exchange is an application that is well suited to uncovering problems from such testing because of its transaction log replay behavior and its checksum function. Damage to Exchange files often becomes evident during post-failure transaction log replay and recovery, or through verifying checksums on the database files after a test pass.

    For more information about input/output (I/O) atomicity, and its importance for data integrity after a hard failure, refer to "Best Practices" later in this document.

  • Cluster failovers. As an application is transitioned from one cluster node to another, disk I/O may be lost or not properly queued during the transition. Even though individual components may be robust and well designed, they may not work well together as a cluster system. This is one reason that Microsoft has a qualification program for cluster systems that is separate from the qualification program for stand-alone components. The cluster system qualification program tests all critical components together rather than separately.

  • Resets and other events in the disk subsystem. Companies are increasingly implementing Storage Area Network (SAN) and other centralized storage technologies, in which multiple servers access a shared storage frame. Not only is correct configuration and isolation of disk resources essential in these environments, but you must also manage redundant I/O paths and an increasing number of filters and services that are involved in disk I/O. The increasing complexity of the I/O chain necessarily introduces additional points of failure and exposes poor product integration.

  • Hardware or firmware bugs. Standard diagnostic test runs are seldom successful in diagnosing these problems. (If the standard diagnostic run could catch this particular problem, would it not already have been caught?) Understanding these issues frequently requires correlating data from multiple servers and using specialized diagnostic suites and stress test harnesses.

This is not a comprehensive list of all causes of error –1018, but it does outline the problem categories that account for the majority of these errors.

Can Exchange Cause a –1018 Error?

Can Exchange be the root cause of a –1018 error? Exchange might be responsible for creating a –1018 condition if it did one or both of the following:

  • Constructed the wrong checksum for a page.

  • Constructed a page correctly, but instructed the operating system to write the page in the wrong location.

The Exchange mechanisms for generating checksums and writing pages back to the database files are based on simple algorithms that have been stable since the first Exchange release. Even the addition of the ECC checksum in Exchange Server 2003 SP1 did not fundamentally alter the page integrity checksum mechanism. The ECC checksum is an additional checksum placed next to the original corruption detection checksum. The integrity of Exchange database pages is still verified through the original checksum.

Note: If you use versions of Esefile or Eseutil from versions of Exchange prior to Exchange Server 2003 SP1 to verify checksums in an Exchange Server 2003 SP1 or later database, nearly every page of the database will be reported as damaged. The page format was altered in Exchange Server 2003 SP1 and previous tools cannot read the page correctly. You must use post-Exchange Server 2003 SP1 tools to verify ECC checksum pages.

A logical error in the page integrity checksum mechanism would likely result in reports of massive and immediate corruption of the database, rather than in infrequent and seemingly random page errors.

This does not mean that there have never been any problems in Exchange that have resulted in logical data corruption. However, these problems cause different errors and not a –1018 error. Error –1018 is deliberately scoped to detect logically random corruptions caused by the underlying platform.

There are a few cases where false positive or false negative –1018 reports have been caused by a problem in Exchange. In these cases, the checksum mechanism worked correctly, but there was a problem in a code path for reporting an error. As a result, either a –1018 error was reported when there was no problem, or a genuine error went unreported. Examination of the affected databases quickly leads to resolution of such issues.

Transaction log file replay is another Exchange capability that allows Microsoft to effectively diagnose –1018 errors that may be the fault of Exchange. Recall from the previous section that online backups are not allowed to complete if –1018 problems exist in the database. In addition, after restoration of a backup, transaction log replay re-creates every change that happened subsequent to the backup. This allows Exchange development to start from a known good copy of the database and trace every change to it.

For an Exchange administrator, the following two symptoms indicate that Exchange should be looked at more closely as the possible cause of a –1018 error:

  • After restoration of an online backup, and before transaction log file replay begins, there is a –1018 error in the restored database files. This could indicate that checksum verification failed to work correctly during backup. It is also possible that the backup media has gone bad, or that data was damaged while being restored because of failing hardware. The next test is more conclusive.

  • After checksum verification of restored databases, a –1018 error is present after successful transaction log replay has completed. This could indicate that a logical problem resulted in generation of an incorrect checksum. Reproducing this problem consistently on different hardware will rule out the possibility that failing hardware further damaged the files during the restoration and replay process.

Conversely, if restoring from the backup and rolling forward the log files eliminate a –1018 error, this is a strong indication that damage to the database was caused by an external problem.

In summary, error –1018 is scoped to report only two specific types of data corruption:

  • A logical page number recorded on a page is nonzero and does not match the physical location of the page in the database.

  • The checksum recorded on a page does not match the actual data recorded on the page.

Exchange thus detects both corruption of the data on a page and guards against the possibility that a page in the database has been written in the wrong place.
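The two checks summarized above can be illustrated with a minimal sketch. Everything about the layout here is an assumption made for illustration: the 4,096-byte page size, the 8-byte header holding a checksum followed by a logical page number, and the simple XOR checksum are stand-ins, not the actual Exchange page format or checksum algorithm.

```python
import struct

PAGE_SIZE = 4096               # assumed page size for this sketch
HEADER = struct.Struct("<II")  # hypothetical header: checksum, logical page number


def toy_checksum(payload: bytes) -> int:
    """XOR of little-endian 32-bit words; a stand-in, not the ESE algorithm."""
    value = 0
    for (word,) in struct.iter_unpack("<I", payload):
        value ^= word
    return value


def make_page(page_number: int, body: bytes) -> bytes:
    """Build a page in the hypothetical layout, padding the body with zeros."""
    payload_size = PAGE_SIZE - HEADER.size
    payload = body.ljust(payload_size, b"\0")[:payload_size]
    return HEADER.pack(toy_checksum(payload), page_number) + payload


def verify_page(page: bytes, physical_page_number: int) -> str:
    """Apply the two checks in the order the summary lists them."""
    stored_checksum, logical_number = HEADER.unpack_from(page)
    payload = page[HEADER.size:]
    # Check 1: a nonzero logical page number must match the physical location.
    if logical_number != 0 and logical_number != physical_page_number:
        return "wrong-location"
    # Check 2: the recorded checksum must match the data actually on the page.
    if toy_checksum(payload) != stored_checksum:
        return "checksum-mismatch"
    return "ok"
```

In this sketch, a page read back from the position it was written for verifies cleanly, the same page found at a different position is flagged as misplaced, and a single flipped byte in the payload is flagged as a checksum mismatch.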

How Microsoft IT Responds to Error –1018

Microsoft IT uses Microsoft Operations Manager (MOM) 2005 to monitor the health and performance of Microsoft Exchange servers. MOM sends alerts to operator consoles for critical errors, including error –1018.

MOM provides enterprise-class operations management to improve the efficiency of IT operations. You can learn more about MOM at the Microsoft Windows Server System Web site [ http://www.microsoft.com/mom/default.mspx ] .

At Microsoft, automatic e-mail notifications are sent to a select group of hardware analysts whenever a –1018 occurs. Thus, all –1018 errors are investigated by an experienced group of people who track the errors over time and across all servers. As you will see later in this document, this approach is an important part of the methodology at Microsoft for handling –1018 errors.

Monitoring Backup Success

Every organization, regardless of size, should monitor Exchange servers for error –1018. The most basic way to accomplish this, if your organization does not use a monitoring application such as MOM, is to verify the success of each Exchange online backup. Even if you do use MOM, you should still monitor backup success separately.

If Exchange online backups are failing unnoticed, you are at risk on at least these counts:

  • A common reason for backup failure is that the database has been damaged. Thus, the Exchange platform may be at risk of sudden failure.

  • You do not have a recent known good backup of critical Exchange data. While an older backup can be rolled forward with additional transaction logs for zero loss restoration, the older the backup, the less likely this will be successful, for a number of operational reasons. For example, an older backup tape may be inadvertently recycled. In addition, if the platform issues on the Exchange server result in loss of the transaction logs, rolling forward will be impossible.

  • After successful completion of an online backup, excess transaction logs are automatically removed from the disk. With backups not completing, transaction log files will remain on the disk, and you are at risk for eventually running out of disk space on the transaction log drive. This will force dismount of all databases in the affected storage group. (If a transaction log drive becomes full, do not simply delete all the log files. Instead, refer to the Microsoft Knowledge Base article "How to Tell Which Transaction Log Files Can Be Safely Removed" [ http://support.microsoft.com/kb/240145 ].)

Verifying backup success is arguably the single most important monitoring task for an Exchange administrator.

As a best practice, Microsoft IT not only sets notifications and alerts for backup errors and failures, but also for backup successes. A daily report for each database is generated and reviewed by management. This review ensures that there is positive confirmation that each database has actually been backed up recently, and that there is immediate attention to each backup failure.
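The idea of positive confirmation can be sketched as a small check over the most recent successful backup time of each database. This is only an illustration: the function name, the input dictionary, and the 24-hour threshold are assumptions, and the paper does not describe how the Microsoft IT report is actually produced.

```python
from datetime import datetime, timedelta


def stale_backups(last_success, now, max_age=timedelta(hours=24)):
    """Return the databases that lack a successful backup within max_age.

    last_success maps a database name to the completion time of its most
    recent successful online backup, or to None if no success is on record.
    """
    stale = []
    for name in sorted(last_success):
        finished = last_success[name]
        # A missing record counts as a failure, not as "nothing to report".
        if finished is None or now - finished > max_age:
            stale.append(name)
    return stale
```

Treating a missing success record as a failure is the point of the design: a database that silently stopped being backed up shows up in the report instead of disappearing from it.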

Securing Data after a –1018 Error

The most common way that a –1018 error comes to the attention of Microsoft IT analysts is through a backup failure. While a –1018 error may occur during normal database operation, normal run-time –1018 errors are less frequent than errors during backup.

Note: Exchange databases perform several self-maintenance tasks on a regular schedule (which can be set by the administrator). One of these tasks, called online defragmentation, consolidates and moves pages within the database for better efficiency. Thus, error –1018 may be reported more frequently during the online maintenance window than during normal run time.

This is the general process that occurs at Microsoft after a –1018 error:

  • MOM alerts are generated and e-mail notification is sent to Exchange analysts.

  • Verification is done that recent good backups exist for all databases on the server.

    It is important that backups are good for all databases, and not just the database affected by the –1018, because the error indicates that the entire server is at risk.

  • All transaction log files on the server are copied to a remote location, in case there is a failure of the transaction log drive. As the investigation proceeds, new log files are periodically copied to a safe location or backed up incrementally.

    You can copy log files to a safe location by doing an incremental or differential online backup. In Exchange backup terminology, an incremental or differential backup is one that backs up only transaction log files and not database files. An incremental backup removes logs from the disk after copying them to the backup media. A differential backup leaves logs on the disk after copying them to the backup media.

After existing Exchange data has been verified to be recoverable and safe, it is time to begin assessing the server and performing root cause analysis.

Server Assessment and Root Cause Analysis

There are two levels at which you must gauge the seriousness of a –1018 error:

  • The immediate impact of the error on the functioning of the database.

  • The likelihood of additional and escalating failures.

These two factors are independent of each other. Ignoring a –1018 error because the damaged page is not an important one is a mistake. The next page destroyed may be critical and may result in a sudden catastrophic failure of the database.

There are two common analysis and recovery scenarios for a –1018 condition:

  • There is only a single error, and little or no immediate impact on the overall functioning of the server. You have time to do careful diagnosis, and plan and schedule a low-impact recovery strategy. However, root cause analysis is likely to be difficult because the server is not showing obvious signs of failure beyond the presence of the error.

  • There are multiple damaged pages or the error occurs in conjunction with other significant failures on the server. You are in an emergency recovery situation.

In the majority of emergency recovery situations, root cause analysis is simple because there is a strong likelihood that the –1018 was caused by a catastrophic or obvious hardware failure. Even in an emergency situation, you should take the time to preserve basic information about the error that is needed for statistical trending across servers. For more information, see "Appendix B: –1018 Record Keeping" later in this document.

Even before root cause analysis, your first priority should be to make sure that existing data has already been backed up and that current transaction log files are protected. Then you can begin analysis with bookending.

Bookending

The point at which a page was actually damaged and the point at which a –1018 was reported may be far apart in time. This is because a –1018 error will only be reported when a page is read by the database. Bookending is the process of bracketing the range of time in which the damage must have occurred.

The beginning bookend is the time the last good online backup was done of the database (marked by event 221). Because the entire database is checked for –1018 problems during backup, you know that the problem did not occur until after the backup occurred. The other bookend is the time at which the –1018 error was actually reported in the application log. Frequently, this will be a backup failure error (event 217). The event that caused the –1018 error must have occurred between these two points in time.

After you have established your bookends, the next task is to look for what else happened on the server during this time that may be responsible for the –1018 error:

  • Was there a hard server or disk failure?

  • Was the server restarted (event 6008 in the system log)?

  • Were Exchange services stopped and restarted?

  • Have there been any storage-related errors? This includes memory, disk, multipath, and controller errors. Not only should you search the Windows system log, but you should also be aware of other logging mechanisms that may be used by the disk system. Many external storage controllers do not log events to the Windows system or application logs, and, by default, the controller may not be set up to log errors. You must ensure that error logging is enabled and that you can locate and interpret the logs.

  • Did Chkdsk run against any of the volumes holding Exchange data?

  • If this is a clustered server, were there any failovers or other cluster state transitions?

  • Have any hardware changes been made, or has firmware or software been upgraded on the server?

Any unusual event that occurred between the bookend times must be considered suspect. If there are no unusual events that can account for damage to the database files, you must consider the possibility that there is an undetected problem with the reliability of the underlying platform.
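Bookending can be sketched as a filter over event log entries that have already been exported into memory. The tuple layout and function name below are assumptions made for illustration; only the event ID 221 for a completed backup comes from the text above.

```python
from datetime import datetime

BACKUP_COMPLETED = 221  # event ID named in the text for a good online backup


def bookend(events, error_time):
    """Bracket the period in which the damage must have occurred.

    events is an iterable of (timestamp, event_id, description) tuples.
    Returns (start, suspects): the time of the last good backup before the
    error was reported, and every event logged strictly between the bookends.
    """
    events = list(events)
    backups = [stamp for stamp, event_id, _ in events
               if event_id == BACKUP_COMPLETED and stamp < error_time]
    if not backups:
        return None, []  # no known good backup, so no beginning bookend
    start = max(backups)
    suspects = sorted(e for e in events if start < e[0] < error_time)
    return start, suspects
```

Every entry returned in suspects is a candidate for the review described above. An empty list points toward an undetected platform problem or a transient condition.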

It is also possible that the error is due to a transient condition external to the hardware. A variety of environmental factors can corrupt computer data or cause transient malfunctions. Vibration, heat, power fluctuations, and even cosmic rays are known to cause glitches or even permanent damage. Hard drive manufacturers are well aware that normally functioning drives are not 100 percent reliable in their ability to read, write, and store data, and design their systems to detect errors and rewrite corrupted data.

Keeping in mind that no computer storage system is 100 percent reliable, how can you decide whether a –1018 is indicative of an underlying problem that you should address, or is just a random occurrence that you should accept?

A Microsoft Senior Storage Technologist who has extensive experience in root cause analysis of disk failures and Exchange –1018 errors suggests this principle: For 100 Exchange servers running on similar hardware, you should experience no more than a single –1018 error in a year. The phrase running on similar hardware is important in understanding the proper application of this principle.

Standardizing on a single hardware platform for Exchange is useful in root cause analysis of –1018 errors. In the absence of an obvious root cause, the next step of investigation is to look for patterns of errors across similar servers.

A single –1018 error on a single page may be a random event. Only after another –1018 error occurs on the same or a similar server do you have enough information to begin looking for a trend or common cause. If a –1018 error occurs on two servers that have nothing in common, you have two errors that have nothing in common rather than two data points that may reveal a pattern.

As a general rule, if you average less than one –1018 error across 100 servers of the same type per year, it is unlikely that root cause analysis will reveal an actionable problem.

This does not mean that you should not record data for each –1018 error that occurs on a server. Until a second error has occurred, you cannot know whether a particular error falls below the threshold of this principle.
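The rule of thumb can be written as a simple rate calculation. The function name and the normalization to server-years are assumptions made for illustration; the threshold of one error per 100 similar servers per year is the one stated above.

```python
def errors_per_100_server_years(error_count, server_count, days_observed):
    """Normalize a count of -1018 errors to a rate per 100 servers per year.

    Only servers of the same hardware type should be counted together.
    """
    if server_count <= 0 or days_observed <= 0:
        raise ValueError("server_count and days_observed must be positive")
    server_years = server_count * days_observed / 365.0
    return 100.0 * error_count / server_years
```

For example, three errors in a year across 50 similar servers works out to a rate of 6, well above the threshold of 1, which suggests that looking for a common cause is worthwhile.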

If a –1018 error is caused by a subtle hardware problem, providing data from multiple errors can be critical. With only a single error to consider, it is likely to be difficult for Microsoft or a hardware vendor to identify a root cause beyond what you can identify on your own. For two actual –1018 root cause investigations, and examples of how difficult and subtle some issues can be to analyze, see "Appendix A: Case Studies" later in this document.

Detailed information about every –1018 error that happens at Microsoft is logged into a spreadsheet as described in "Appendix B: –1018 Record Keeping" later in this document.

Verifying the Extent of Damage

Error –1018 applies to problems on individual pages in the database, and not to the database as a whole. When a –1018 error is reported, you cannot assume that the reported page is the only one damaged. Because a backup will stop at the moment the first –1018 is encountered, you cannot even rely on errors reported during the backup to show you the full extent of the damage.

You need to know how many pages are damaged in the database as part of deciding on a recovery strategy. If multiple pages are damaged, multiple errors have likely occurred, and the platform should be considered in imminent danger of complete failure.

In the majority of –1018 cases investigated by Microsoft IT, there is only a single damaged page in the database. In this circumstance, absent other indications of an underlying problem, Microsoft IT will leave the server in service and wait to implement recovery until a convenient off-peak time. The assumption is that this is a random error, unless a second error in the near future or similar issues on other servers indicate a trend.

Note: Remember that an error –1018 prevents an online backup from completing. Delaying recovery of the database will require you to recover with an increasingly out-of-date backup. This situation will definitely result in longer downtime during recovery, because of additional transaction log files that must be replayed. In Exchange Server 2003 SP1, the typical performance of log file replay is better than 1,000 log files per hour with that performance remaining consistent, regardless of the number of log files that must be replayed. In prior versions of Exchange, transaction log file replay can be more than five times slower, with the average speed of replay tending to diminish as more logs are replayed.
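The cost of delaying recovery can be estimated with simple arithmetic. This sketch assumes a steady number of log files generated per day, which is a hypothetical input, and uses the figure of 1,000 log files per hour quoted above as the default replay rate. Because the text describes that rate as a lower bound on typical Exchange Server 2003 SP1 performance, the result is best read as a rough upper estimate.

```python
def logs_to_replay(days_since_backup, logs_per_day):
    """Number of log files accumulated since the last good backup."""
    return days_since_backup * logs_per_day


def replay_hours(log_file_count, files_per_hour=1000.0):
    """Estimated transaction log replay time in hours."""
    if files_per_hour <= 0:
        raise ValueError("files_per_hour must be positive")
    return log_file_count / files_per_hour
```

With a hypothetical 2,000 log files per day, waiting three days adds about six hours of replay at 1,000 files per hour. At a rate five times slower, the same backlog takes about 30 hours.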

Comprehensively testing an entire database for –1018 pages requires taking the database offline and running Eseutil in checksum mode.

If you bring a database down after a –1018 error has occurred, there is some chance that it will not start again. If other, unknown pages have also been damaged, one of them could be critical to the startup of the database. Statistically, this is a low probability risk, and Microsoft IT does not hesitate to dismount databases that have displayed run-time –1018 errors.

Eseutil is installed in the \exchsrvr\bin folder of every Exchange server. When run in checksum mode (with the /K switch), Eseutil rapidly scans every page in the database and verifies whether the page is in the right place and whether its checksum is correct. Eseutil runs as rapidly as possible without regard to other applications running on the server. Running Eseutil /K on a database drive shared with other databases is likely to adversely affect the performance of other running databases. Therefore, you should schedule testing of a database for off-peak hours whenever possible.

Note: If you decide to copy Exchange databases to different hardware to safeguard them, be sure that you copy them rather than move them. The problems on the current platform may not be in the disk system, but may cause corruption to occur during the move process. If you move the files, you get no second chance if this corruption happens.

At Microsoft, Eseutil checksum verification is done by running multiple copies of Eseutil simultaneously against the database. One instance of Eseutil /K is started against the database, and after a minute, another instance is started against the same database. The reason for doing this is that in a mirror set, one side of the mirror may have a bad page, but the other side may not.

Running two copies of Eseutil slightly out of synch with each other makes it much more likely that both sides of a mirror will be read. It is not often that one side of a mirror is good and one side is bad, but it does happen, and a thorough test requires testing both sides of the mirror. At Microsoft, this Eseutil regimen is also run five times in succession, to further increase the confidence level in the results.

Note: Multiple runs of Eseutil /K are unnecessary if databases are stored on a RAID-5 stripe set, where data is striped with parity across multiple disks. This is because there is only one copy of a particular page in the set, with redundancy being achieved by the ability to rebuild the contents of a lost drive from the parity. Also, note that as a general rule, RAID 1 (Mirroring) or RAID 1+0 (Mirroring+Striping) drive sets are recommended for heavily loaded database drives for performance reasons.
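The staggered regimen described above could be scripted along the following lines. This is a sketch under stated assumptions, not the script used at Microsoft: it assumes that eseutil is on the path, that the database has already been dismounted, and that the caller reviews the Eseutil output for each run. The function name and the injectable launch and sleep parameters are hypothetical and exist only to make the sketch testable. Exit codes are collected but not interpreted.

```python
import subprocess
import time


def run_staggered_checksums(db_path, rounds=5, stagger_seconds=60,
                            launch=subprocess.Popen, sleep=time.sleep):
    """Run two overlapping 'eseutil /k' passes per round against one database.

    Returns a list of (first_exit_code, second_exit_code), one per round.
    """
    command = ["eseutil", "/k", db_path]
    exit_codes = []
    for _ in range(rounds):
        first = launch(command)
        # Start the second pass a minute behind the first so that the two
        # readers are out of synch and are more likely to hit both sides
        # of a mirror set.
        sleep(stagger_seconds)
        second = launch(command)
        exit_codes.append((first.wait(), second.wait()))
    return exit_codes
```

The defaults of five rounds and a 60-second stagger follow the regimen in the text. On a RAID-5 set, a single pass is sufficient, as the note above explains.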

Recovering from a –1018 Error

Microsoft IT undertakes two fundamental tasks to recover from a –1018 error:

  • Correct the root problem that caused the error.

  • Recover Exchange data damaged by the error.

These tasks are not completely independent of each other. What is discovered about the root cause may influence the data recovery strategy.

For example, if there are overt signs that server hardware is in imminent danger of complete failure, the data recovery strategy may require immediate data migration to a different server. If the server appears to be otherwise stable, data recovery may consist merely of restoring from backup, to remove the bad page from the database.

Server Recovery

At Microsoft, a single –1018 error puts a server on a watch list. It does not trigger replacement or retirement of the hardware unless there has been positive identification of the component that caused the error. If additional –1018 errors occur on the same server in the near future, regardless of whether the root cause has been specifically determined, the server is treated as untrustworthy. It is taken out of production and extensive testing is done.

It may seem obvious that after any –1018 error occurs, you should immediately take the server down and run a complete suite of manufacturer diagnostics. Yet this is not something that Microsoft IT does as a matter of course. The reason is that standard diagnostic tests are seldom successful in uncovering the root cause of a –1018 error. This is because:

  • The corruption may be an anomaly. Power fluctuations and interference, temporary rises in heat or humidity, and even cosmic rays can corrupt computer data. Unless these conditions are repeated at the time the test is run, the test will show nothing.

  • If a –1018 error occurs only once and is not accompanied by any other visible errors or issues, it is probable that the server is currently functioning normally. The condition that caused the problem may occur infrequently or require a particular confluence of circumstances that cannot be replicated by a general diagnostic tool.

  • Hardware frequently fails in an intermittent rather than steady or progressive pattern.

  • The problem may be the result of a subtle hardware or firmware bug rather than due to a progressively failing component. In this case, ordinary manufacturer diagnostics may be incapable of uncovering the issue. If these diagnostics could detect the issue, it would have already been uncovered in a previous diagnostic run.

  • The problem may be a Heisenbug. The term Heisenbug refers to a problem that cannot be reproduced because the diagnostic tools used to observe the system change the system enough that the problem no longer occurs. For example, a tool that monitors the contents of RAM may slow down processing enough that timing tolerances are no longer exceeded, and the problem disappears.

  • The diagnostic tool may not be able to simulate a load against the server that is sufficiently complex. There is a misconception that –1018 errors are more likely to appear when you place a system under a heavy I/O load. The experience at Microsoft is that the complexity of the load is more relevant to exposing a data corruption issue than is the overall level of load. Complexity can be in the type of access (the I/O size combined with direction), as well as in the actual data content patterns. Certain complex patterns can show noise or crosstalk problems that will not be exposed by simpler patterns. One of the strengths of the Finisar Medusa Labs Test Tool Suite is its ability to generate such patterns.

Manufacturer diagnostics are typically run only after the server has already been taken out of production. This happens after a pattern of –1018 errors has established that an underlying problem exists, but the root cause has not yet been discovered. Along with these diagnostics, Microsoft IT also tries to reproduce data corruption problems by using tools that stress the disk subsystem.

The Jetstress (Jetstress.exe) and Exchange Server Load Simulator (LoadSim) tools can be used to realistically simulate the I/O load demands of an actual Exchange server. The primary function of these tools is for capacity planning and validation, but they are also useful for testing hardware capabilities.

Jetstress creates several Exchange databases and then exercises the databases with realistic Exchange database I/O requests. This approach allows determining whether the I/O bandwidth of the disk system is sufficient for its intended use.

LoadSim simulates Messaging application programming interface (MAPI) client (Microsoft Office Outlook 2003) activity against an Exchange server and is useful for judging the overall performance of the server and network. LoadSim requires additional client workstation computers to present high levels of client load to the server.

While neither tool is intended as a disk diagnostic tool, both can be used to create large amounts of realistic Exchange disk I/O. For this purpose, most people prefer Jetstress because it is simpler to set up and tune. Both Jetstress and LoadSim come with extensive documentation and setup guidance and are available free for download from Microsoft. You can download Jetstress [ http://www.microsoft.com/downloads/details.aspx?FamilyId=94B9810B-670E-433A-B5EF-B47054595E9C&displaylang=en ] from the Microsoft Download Center. You can download LoadSim from the Microsoft Windows Server System Web site [ http://www.microsoft.com/exchange/downloads/2000/loadsim.mspx ] .

Microsoft IT also uses the Medusa Labs Test Tools Suite from Finisar for advanced stress testing of disk systems. The Finisar tools can generate complex and specific I/O patterns, and are designed for testing the reliability of enterprise-class systems and storage. While Jetstress and LoadSim are capable of generating realistic Exchange server loads, the Finisar tools generate more complex and demanding I/O patterns that can uncover subtle data and signal integrity issues.

For detailed information about the Medusa Labs Test Tools Suite, see the Finisar Web site [ http://www.finisar.com/nt/Medusalabs.php ] .

Use of Jetstress, LoadSim, or the Medusa tools requires that the server be taken out of production service. Each of these tools, used in a stress test configuration, makes the server unusable for other purposes while the tests are running.

The Eseutil checksum function is also sometimes useful in reproducing unreliability in the disk system. Eseutil scans through a database as quickly as possible, reading each page and calculating the checksum that should be on it. It will use all the disk I/O bandwidth available. This puts significant I/O load on the server, although not a particularly complex load. If successive Eseutil runs report different damage to pages, this indicates unreliability in the disk system. This is a simple test to uncover relatively obvious problems. A disk system that fails this test should not be relied on to host Exchange data in production. However, the Eseutil checksum function is unlikely to reveal subtle problems in the system.
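The comparison of successive runs can be expressed as a small check. The input format is an assumption: each run is represented by the collection of page numbers that the run reported as damaged, which the administrator would have to extract from the Eseutil output by some means not shown here.

```python
def runs_disagree(damaged_pages_per_run):
    """Return True if successive checksum runs reported different damage.

    damaged_pages_per_run is a sequence with one entry per run, each entry
    being the page numbers that the run reported as damaged.
    """
    reports = {frozenset(pages) for pages in damaged_pages_per_run}
    # One distinct report (or none) means the runs were consistent.
    return len(reports) > 1
```

Consistent results across runs do not prove that the disk system is healthy. They show only that the damage is stable, which is the expected result for a page that was written incorrectly once.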

Another test that is frequently done is to copy a large checksum-verified file (such as an Exchange database) from one disk to another. If the file copy fails with errors, or the copied file is not identical to the source, this is a strong indication of serious disk-related problems on the server.
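A file copy test of this kind can be sketched as follows. The choice of SHA-256 and the function names are assumptions made for illustration. Note that the operating system may serve the verification read from its file cache rather than from the disk, so a passing result from this sketch is weaker evidence than a failing one.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large databases do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination and report whether the copies are identical.

    An exception raised during the copy is itself a sign of trouble and is
    deliberately allowed to propagate to the caller.
    """
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)
```

Because the source file is copied rather than moved, the original remains available if the copy turns out to be damaged.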

As a final note about server recovery, you should verify that the Exchange server and disk subsystem are running with the latest firmware and drivers recommended by the manufacturers. If they are not, it is possible that upgrading will resolve the underlying problem.

Microsoft works closely with manufacturers when –1018 patterns are correlated with particular components or configurations, and hardware manufacturers are continually improving and upgrading their systems. In rare cases, you may discover that –1018 errors begin occurring soon after application of a new driver or upgrade. This is another case where a standardized hardware platform can make troubleshooting and recognizing patterns easier.

Data Recovery

The first—if somewhat obvious—question to answer when deciding on a data recovery strategy is this: Is the database still running?

If the database is running, you know that the error has not damaged parts of the database critical to its continuing operation. While some user data may have already been lost, it is likely that the scope of the loss is limited.

The next question is: Do you believe the server is still reliable enough to remain in production?

At Microsoft, if a single –1018 occurs on a server but there is no other indication of system instability, the server is deemed healthy enough to remain in production indefinitely. This conclusion is subject to the appearance of additional errors.

Before deciding on a data recovery strategy, you must assess the urgency with which the strategy must be executed. Along with the current state of the database, what you have learned already from the root cause analysis will factor heavily into this assessment. The following questions must be considered:

  • Has more than one error occurred? If multiple errors have occurred, or additional errors are occurring during your troubleshooting, you should consider it highly likely that the entire platform may suddenly fail.

  • Is more than one database involved?

  • Is the platform obviously unstable? For example, suppose that you find during root cause analysis that you cannot copy large files to the affected disk without errors during the copy. It becomes much more urgent at this point to move the databases to a different platform immediately.

  • Is there a recent backup of the affected data? If you have not been monitoring backup success, backups may have been failing for days or weeks because the database was already damaged. You are at even greater risk if there is a sudden failure of the server.

If you do not have a good, recent online backup, you must make it a high priority to shut down the databases and copy the database files from the server to a safe location. If you do not have a recent online backup, and if you do not make an offline backup, you run the risk that subsequent damage to the database will make it irreparable and result in catastrophic data loss.

While it is true that the database is already damaged, it can be repaired with Eseutil, as long as the damage does not become too extensive. More detail about repairing the database is provided later in this document.

Microsoft IT chooses from several standard strategies to recover a database after a –1018 error occurs. The next sections outline the advantages and disadvantages of each strategy, along with the preconditions required to use the strategy.

Restore from Backup

Restoring from a known good backup and rolling the database forward is the only strategy guaranteed to result in zero data loss regardless of how many database pages have been damaged. This strategy requires the availability and integrity of all transaction logs from the time of backup to the present.

The reason that this strategy results in zero data loss is that after Exchange detects non-transient –1018 damage on a page, the page is never again used or updated. One of two conditions applies: either the backup copy of the database already carries the most current version of the page, or one of the transaction logs after the point of backup carries the last change made to the page before it was damaged. Thus, restoring and rolling forward expunges the bad page with no data loss.
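The zero data loss argument can be illustrated with a toy model. It is a deliberate simplification: pages are represented as dictionary entries and each log record is assumed to carry the full new content of one page, which is not how Exchange transaction logs are actually structured.

```python
def restore_and_roll_forward(backup_pages, log_records):
    """Rebuild the current database state from a backup plus its logs.

    backup_pages maps page number to page content at backup time.
    log_records is an ordered list of (page_number, new_content) changes
    made after the backup was taken.
    """
    pages = dict(backup_pages)          # start from the known good copy
    for page_number, new_content in log_records:
        pages[page_number] = new_content  # replay each change in order
    return pages
```

In this model, a page that was damaged on disk after its last logged change is rebuilt from the backup copy and the log records alone, so the damaged copy is never consulted.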

Note: Before restoring from a backup, you should always make a copy of the current database. Even if the database is damaged, it may be repairable. If you restore from a backup, the current database will be overwritten at the beginning of the restoration process. If restoration fails, and you have a copy of the damaged database, you can then fall back on repairing the database as your recovery strategy.

Restoration from a backup is the method used the majority of the time by Microsoft IT to recover from a –1018 error. Each Exchange database at Microsoft is sized so that it can be restored in about an hour.

Restoration is also much faster than other recovery strategies. Assuming that the server is deemed stable enough, restoration is scheduled for an off-peak time, and results in minimal disruption for end users. For more information about how Microsoft backs up Exchange, refer to the IT Showcase paper Backup Process Used with Clustered Exchange Server 2003 Servers at Microsoft [ http://www.microsoft.com/technet/itsolutions/msit/operations/exchbkup.mspx ] .

Migrate to a New Database

Exchange System Manager provides the Move Mailbox facility for moving all mailbox data from one database or server to another. This can be done while the database is online, and even while users are logged on to their mailboxes. However, most Exchange administrators prefer to schedule a general outage when moving mailboxes so that individual users do not experience a short disconnect when each mailbox is moved.

In Exchange Server 2003, mailbox moves can be scheduled and batched. In conjunction with Microsoft Office Outlook's Exchange Cached Mode, the interruption in service when each mailbox is moved often goes unnoticed by end users, who can continue to work from a cached copy of the mailbox.

For public folder databases, each folder can be migrated to a different server by replication. If additional replicas of all folders already exist on other servers, you can migrate all data by removing all replicas from the problem database. This will trigger a final synchronization of folders from this database to the other replicas.

After Exchange System Manager shows that replication has finished for all folders in a public folder database, you may delete the original database files. When you mount a database again after deleting its files, a new, empty database is generated. You can then replicate folders from other public folder servers back to this new database, if desired.

Migrating data to a different database leaves behind any –1018 or –1019 problems because bad pages will not be used during the move or replication operations. Unlike a restore and roll forward strategy, however, migrating data will not recover the information that was on the bad page. The bad data is simply left behind.

A particular message, folder, or mailbox may fail to move, and you may notice a simultaneous –1018 error in the application log. This can allow you to identify the error and the data affected by it. In Exchange Server 2003, new move mailbox logging can report details about each message that fails to move, or can skip mailboxes that show errors during a mass move operation. For more details about configuring, batching, and logging mailbox move operations, refer to Exchange Server 2003 online Help.

Sometimes, a single bad page can affect multiple users. This is because of single instance storage. In an Exchange database, if a copy of a message is sent to multiple users, only one copy of the message is stored, and all users share a link to it.

Sometimes, the data migration will complete with no errors, even though you know there are –1018 problems in the database. This will happen if the bad page is in a database structure such as a secondary index. Such structures are not moved, but are rebuilt after data is migrated. If the Move Mailbox or replication operations complete with no errors, this indicates the bad page was in a section of the database that could be reconstructed, or in a structure such as a secondary index that could be discarded. In these cases, migrating from the database does result in a zero data loss recovery.

Moving or replicating all the data in a 50-gigabyte (GB) database can take a day or two. Therefore, if you choose a migration strategy, you must believe that the server is stable enough to remain in service long enough to complete the operation.

Repair the Database

The Eseutil and Information Store Integrity Checker (Isinteg.exe) tools are installed on every Exchange server and administrative workstation. These tools can be used to delete bad pages from the database and restore logical consistency to the remaining data.

Repairing a database typically results in some loss of data. Because Exchange treats a bad page as completely unreadable, nothing that was on the page will be salvaged by a repair. In some cases, repair may be possible with zero data loss, if the bad page is in a structure that can be discarded or reconstructed. The majority of pages in an Exchange database contain user data. Therefore, the chance that a repair will result in zero data loss is low.

Repair is a multiple stage procedure:

  1. Make a copy of the database files in a safe, stable location.

  2. Run Eseutil in repair mode (/P command-line switch). This removes bad pages and restores logical consistency to individual database tables.

  3. Run Eseutil in defragmentation mode (/D command-line switch). This rebuilds secondary indexes and space trees in the database.

  4. Run Isinteg in fix mode (-Fix command-line switch). This restores logical consistency to the database at the application level. For example, if several messages were lost during repair, Isinteg will adjust folder item counts to reflect this, and will remove missing message header lines from folders.

Typically, repairing a database takes much longer than restoring it from a backup and rolling it forward. The amount of time required varies depending on the nature of the damage and the performance of the system. As an estimate, the repair process often takes about one hour per 10 GB of data. However, it is not uncommon for it to be several times faster or slower than this estimate.

Repair also requires additional disk space for its operations. You must have space equivalent to the size of the database files. If this space is not available on the same drive, you can specify temporary files on other drives or servers, but doing so will dramatically reduce the speed of repair.
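The planning arithmetic above can be sketched in a few lines of Python. This is only a rough aid: the function name and the `speed_factor` parameter are illustrative assumptions, and the only figures taken from this paper are the estimate of about one hour per 10 GB and the need for free space equivalent to the size of the database files.

```python
def estimate_repair(db_size_gb, speed_factor=1.0):
    """Rough planning figures for an Eseutil repair.

    Uses this paper's estimate of about one hour per 10 GB of data.
    speed_factor is a hypothetical knob: values above 1.0 model a faster
    system, values below 1.0 a slower one.
    """
    if db_size_gb < 0 or speed_factor <= 0:
        raise ValueError("size must be non-negative and speed_factor positive")
    hours = db_size_gb / 10.0 / speed_factor
    # Repair needs working space equivalent to the database files themselves.
    scratch_gb = db_size_gb
    return hours, scratch_gb


hours, scratch = estimate_repair(50)
print(f"About {hours:.1f} hours and {scratch} GB of free space for a 50 GB database")
```

Because actual repair times are often several times faster or slower than the estimate, treat the result as a starting point for scheduling rather than a commitment.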

Because repair is slow and usually results in some data loss, it should be used as a recovery strategy only when you cannot restore from a backup and roll the database forward.

There may be cases where you have a good backup, but are unable to roll the database forward. You can then combine the restoration and repair strategies to recover the maximal amount of data. This option is explored in more detail in the next section.

The database repair tools have been refined and improved continually since the first version of Exchange was released, and they are typically effective in restoring full logical consistency to a database. Despite the effectiveness of repair, Microsoft IT considers repair an emergency strategy to be used only if restoration is impossible. Because Microsoft IT is stringent about Exchange backup procedures, repair is almost never used except as part of the hybrid strategy described in the next section.

After repairing a database, Microsoft IT policy is to migrate all folders or mailboxes to a new database rather than to run a repaired database indefinitely in production.

Restore, Repair, and Merge Data

There is a hybrid recovery strategy that can be used if you are unable to roll forward with a restored database because a disaster has destroyed necessary transaction log files.

In this scenario, an older, but good, copy of the database is restored from a backup. Because the transaction logs needed for zero loss recovery are unavailable, the restored database is missing all changes since the backup was taken.

However, the damaged database likely contains the majority of this missing data. The goal is to merge the contents of the damaged database with the restored database, thus recovering with minimal data loss.

To do this, the damaged database is moved to an alternate location where it can be repaired while the restored database is running and servicing users. In Exchange Server 2003, you can use the recovery storage group feature to do the restoration and repair on the same server. In previous versions of Exchange, it was necessary to copy the database to a different server to repair it and merge data.

Bulk merge of data between mailbox databases can be accomplished in two ways:

  • Run the Mailbox Merge Wizard (ExMerge). You can download ExMerge from the Microsoft Download Center [ http://www.microsoft.com/downloads/details.aspx?FamilyID=429163EC-DCDF-47DC-96DA-1C12D67327D5&displaylang=en ] . ExMerge will copy mailbox contents between databases, suppressing copying of duplicate messages, and allowing you to filter the data merge based on timestamps, folders, and other criteria. ExMerge is a powerful and sophisticated tool for extracting and importing mailbox data.

  • Use the Recovery Storage Group Wizard in Exchange System Manager in Exchange Server 2003 SP1. The Recovery Storage Group Wizard merges mailbox contents from a database mounted in the recovery storage group to a mounted copy of the original database. Like ExMerge, the Recovery Storage Group Wizard suppresses duplicates, but it does not provide other filtering choices. For the majority of data salvage operations, duplicate suppression is all that is required. In most cases, the Recovery Storage Group Wizard provides core ExMerge functionality, but is simpler to use.
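The idea of merging with duplicate suppression can be illustrated with a small Python sketch. This is not how ExMerge or the Recovery Storage Group Wizard is implemented. The message dictionaries, the `id` and `received` keys, and the `merge_mailbox` function are all hypothetical, chosen only to show the principle of copying salvaged items that the restored database does not already contain.

```python
from datetime import datetime


def merge_mailbox(restored, salvaged, since=None):
    """Return restored messages plus any salvaged ones not already present.

    Messages are plain dicts with hypothetical 'id' and 'received' keys.
    A message counts as a duplicate when its id already exists in the
    restored mailbox. 'since' optionally filters salvaged messages by time.
    """
    merged = list(restored)
    seen = {m["id"] for m in restored}
    for msg in salvaged:
        if msg["id"] in seen:
            continue  # duplicate suppression
        if since is not None and msg["received"] < since:
            continue  # optional timestamp filter
        merged.append(msg)
        seen.add(msg["id"])
    return merged


# Invented sample data: message "a" is in both copies, "b" arrived after the backup.
restored = [{"id": "a", "received": datetime(2005, 7, 1)}]
salvaged = [{"id": "a", "received": datetime(2005, 7, 1)},
            {"id": "b", "received": datetime(2005, 7, 3)}]
merged = merge_mailbox(restored, salvaged)
print([m["id"] for m in merged])
```

The optional `since` argument mirrors the kind of timestamp filtering that ExMerge offers. Leaving it out corresponds to the duplicate-suppression-only behavior described for the wizard.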

Alternate Server Restoration

Exchange allows restoration of a backup created on one server to a different server. In this scenario, you create a storage group and database on the destination server, and restore the backup to it. You can also copy log files from one server to another to roll the database forward.

This strategy may be necessary if the original server is deteriorating rapidly, and you must find an alternate location quickly to host the database. You can restore either an online backup or offline copies of the databases to the alternate server.

After the database has been restored, you must redirect Active Directory® directory service accounts to the mailboxes now homed on the new server. This can be done in either of two ways:

  • In Exchange Server 2003, use the Remove Exchange Attributes task for all users with mailboxes in the database, followed by using the Mailbox Recovery Center to automatically reconnect all users to the mailboxes on the new server.

  • Use a script for the Active Directory attribute changes to redirect Active Directory accounts to the new server.

This is an advanced strategy. You may want to consult with Microsoft Product Support Services if it becomes necessary to use it, and you have not successfully accomplished it in the past. This strategy may also require the editing or re-creation of client Outlook profiles.

Best Practices

Microsoft IT manages approximately 95 Exchange mailbox servers that host 100,000 mailboxes worldwide. In the last year, there have been six occurrences of error –1018 across all these servers, with the errors limited to two servers.

One server had four errors and another had two errors. In the first case, the root cause was traced to a specific hardware failure. The second server is still under investigation because the two errors occurred very close together in time, but have not occurred since.

Microsoft IT has seen a general trend of decreasing numbers of –1018 errors year over year. This corresponds with the experience of many Exchange administrators who see fewer –1018 errors in Exchange today than in years past. Administrators often assume that the decrease in these errors must be due to improvements in Exchange. However, the credit really belongs to hardware vendors who are continually increasing the reliability and scalability of their products. Microsoft's primary contribution has been to point out problems that the vendors have then solved.

Along with using reliable enterprise-class hardware for your Exchange system, there are several best practices used by Microsoft IT that you can implement to reduce even further the likelihood of encountering data file corruption.

Hardware Configuration and Maintenance

Follow these best practices:

  • Disable hardware write back caching on disk drives that contain Exchange data, or ensure you have a reliable controller that can maintain its cache state if power is interrupted.

    It is important to distinguish here between caching on a disk drive and caching on a disk controller. You should always disable write back caching on a disk drive that hosts Exchange data, but you may enable it on the disk controller if the controller can preserve the cache contents through a power outage.

    When Exchange writes to its transaction logs and database files, it orders the operating system to flush those writes to disk immediately. Nearly all modern disk controllers report to the operating system that writes have been flushed to a disk before they actually have. This means that disks and controllers must ensure that writes have succeeded in case there is a power outage. There is nothing an application can do to reliably override disk system behavior and actually force writes to be secured to a disk.

  • Change cache batteries in disk controllers, uninterruptible power supplies (UPSs), and other power interruption recovery systems as manufacturers recommend. A failed battery is a common reason for data corruption after a power failure.

  • Test systems before putting them in production. Microsoft IT uses Jetstress for burn-in testing of new Exchange systems. The Medusa Labs Test Tool Suite from Finisar is normally used in Microsoft IT only for advanced forensic analysis after less sophisticated tools have not been able to reproduce a problem.

  • Test the actual drive rebuild and hot swap capabilities of your disk system for both performance and data integrity reasons. It is possible that the performance of a system will be so greatly impacted during a drive rebuild operation that it becomes unusable. There have also been cases where the drive rebuild functionality has become unstable when disks have remained under heavy load during a drive rebuild operation.

  • Power down server and disk systems in the order and by the methods recommended by manufacturers. You should know the expected shutdown times for your systems, and at which points a hard shutdown is safe or risky. Many server systems take much longer to shut down than consumer computer systems. The experience of Microsoft Product Support Services is that impatience during shutdown is an all too common cause of data corruption.

  • Standardize the hardware platform used for Exchange. Not only does this improve general server manageability, but it also makes troubleshooting and analysis of errors across servers easier.

  • Stay current on upgrades for servers, disk controllers, switches and other firmware, and software that manage disks and disk I/O.

  • Verify with your vendor that the disk controllers used with Exchange support atomic I/O, and find out the atomicity value.

    To support atomic I/O is to support writing all of the data that an application requests in a single I/O or to write none of it. For example, if an application sends a 64-KB write to a disk, and a hard failure occurs during the write, the result should be that none of the write is preserved on a disk. Atomicity involves all or nothing.

    Without atomic I/O, you are vulnerable to torn pages where a chunk of disk may be composed of a mixture of old and new data. In the 64 KB example, it may be that the first 32 KB is new data and the last 32 KB is old data. In Exchange, a torn 4-KB write to the database will certainly result in a –1018 error.

    The atomicity value refers to the largest single write that the controller guarantees to write on an all or nothing basis. For example, this might be 128 KB: for any I/O request smaller than 128 KB, the write will happen atomically, that is, in effect all at once with no possibility of a partial write. However, for write requests greater than 128 KB, there may be no such guarantee.

    Exchange issues database write commands in 4 KB or smaller chunks. Therefore, on a drive hosting only Exchange databases, a write atomicity of 4 KB is required.
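A short Python simulation can make the torn page scenario concrete. It is a sketch under stated assumptions: the page layout (a 4-byte checksum followed by data) is simplified, and `zlib.crc32` is used as a stand-in checksum, not the algorithm that Exchange actually uses. The point is only that a page composed of half new and half old data no longer matches its stored checksum.

```python
import zlib

PAGE_SIZE = 4096  # Exchange database page size cited in this paper


def make_page(payload_byte):
    """Build a simplified page: a 4-byte checksum followed by the data."""
    data = bytes([payload_byte]) * (PAGE_SIZE - 4)
    # zlib.crc32 is a stand-in here, not the Exchange checksum algorithm.
    checksum = zlib.crc32(data).to_bytes(4, "little")
    return checksum + data


def verify_page(page):
    """Recompute the checksum over the data and compare it with the stored one."""
    stored = int.from_bytes(page[:4], "little")
    return stored == zlib.crc32(page[4:])


old_page = make_page(0x11)
new_page = make_page(0x22)

# A torn write: the first half of the new page reached the disk, the rest did not.
half = PAGE_SIZE // 2
torn_page = new_page[:half] + old_page[half:]

print(verify_page(old_page), verify_page(new_page), verify_page(torn_page))
```

Both complete pages verify, while the torn page fails verification, which is the condition that Exchange reports as a –1018 error when the page is next read.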

Operations

Follow these best practices:

  • Place Exchange databases and transaction log files in separate disk groups. As a rule, Exchange log files should never be placed on the same physical drives as Exchange database files. There are two important reasons for this:

      • Fault tolerance. If the disks hosting Exchange database files also hold the transaction logs, loss of these disks will result in loss of both database and transaction log files. This will make rolling the database forward from a backup impossible.

      • Performance. The disk I/O characteristics for an Exchange database are a high amount of random 4-KB reads and writes, typically with twice as many reads as writes. For Exchange transaction log files, I/O is sequential and consists only of writes. As a rule, mixing sequential and random I/O streams to the same disk results in significant performance degradation.

  • Track all Exchange data corruption issues across all Exchange servers. This provides you data for trend analysis and troubleshooting of subtle platform flaws. For more information, see "Appendix B: –1018 Record Keeping" later in this document.

  • Preserve Windows event logs. It is all too common for event logs generated during the bookend period to be cleared or automatically overwritten. (For details, see "Bookending" earlier in this document.) The event logs are important for root cause analysis. If you are running Exchange in a cluster, ensure that event log replication is configured, or that you gather and preserve the event logs from every node in the cluster, whether actively running Exchange or not.

Conclusion

For most organizations, huge amounts of important data are managed in Microsoft Exchange database files. Current server class computer hardware is very reliable but it is not perfect. Because Exchange data files compose many gigabytes or even terabytes of storage, it is inevitable that the database files will occasionally be damaged by storage failures.

While no administrator welcomes the appearance of a –1018 error, the error prevents data corruption from going undetected, and often provides you with an early warning before problems become serious enough that a catastrophic failure occurs.

Every –1018 error should be logged (as described in Appendix B). Moreover, every –1018 requires some kind of recovery strategy to restore data integrity (as described above in "Recovering from a –1018 Error"). However, not every –1018 error indicates failing or defective hardware.

At Microsoft, a rate of one error –1018 per 100 Exchange servers per year is considered normal and to be expected. This "1 in 100" acceptable error rate is based on Microsoft's experience with the limits of hardware reliability.

Microsoft IT will replace hardware or undertake a root cause investigation if any of the following conditions exist:

  • The –1018 error is associated with other errors or symptoms that indicate failures or defects in the system.

  • More than one –1018 error has occurred on the same system.

  • –1018 errors begin occurring above the "1 in 100" threshold on multiple systems of the same type.
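The conditions above can be expressed as a simple check. The following Python sketch is a simplification, not a Microsoft IT tool: the function names are hypothetical, and the third condition is approximated by comparing the rate across all servers passed in, rather than per hardware type.

```python
def errors_per_100_servers(error_count, server_count, years=1.0):
    """Normalize a -1018 count to errors per 100 servers per year."""
    if server_count <= 0 or years <= 0:
        raise ValueError("server_count and years must be positive")
    return error_count / server_count / years * 100.0


def needs_investigation(errors_by_server, server_count, years=1.0,
                        other_symptoms=False, threshold=1.0):
    """Apply the three conditions to a {server name: error count} mapping.

    Simplification: the rate is computed over all servers supplied, so call
    this once per hardware type to approximate the third condition.
    """
    repeat_offenders = [s for s, n in errors_by_server.items() if n > 1]
    rate = errors_per_100_servers(sum(errors_by_server.values()),
                                  server_count, years)
    return other_symptoms or bool(repeat_offenders) or rate > threshold


# Figures from "Best Practices": six errors in a year across about 95 servers,
# four on one server and two on another (server names are placeholders).
rate = errors_per_100_servers(6, 95)
flag = needs_investigation({"server-a": 4, "server-b": 2}, 95)
print(f"{rate:.1f} errors per 100 servers per year; investigate: {flag}")
```

With those figures the check flags both servers, which is consistent with the root cause investigations described earlier in this paper.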

While there may be nothing you can do about the fact that –1018 errors occur, you can reduce the incidence of errors. If you are experiencing –1018 errors at a rate greater than one or two a year per 100 Exchange servers, the root cause analysis advice and practices outlined in this paper can be of practical benefit to you. Even if you are not experiencing excessive rates of this problem, we hope that the recovery methods suggested in this paper will help you recover more quickly and effectively.

For More Information

For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information through the World Wide Web, go to:

http://www.microsoft.com [ http://www.microsoft.com/ ]

http://www.microsoft.com/itshowcase [ http://www.microsoft.com/itshowcase ]

http://www.microsoft.com/technet/itshowcase [ http://www.microsoft.com/technet/itsolutions/msit/ ]

For any questions, comments, or suggestions on this document, or to obtain additional information about How Microsoft Does IT, please send e-mail to:

showcase@microsoft.com

Appendix A: Case Studies

This section outlines two case studies of actual –1018 investigations, conducted jointly by Microsoft, third-party vendors, and Exchange customers. For privacy reasons, the names of the customers and vendors are omitted, and identifying details may have been changed.

These investigations are not typical of what is required to identify the root cause for the majority of –1018 errors. Rather, they illustrate the more subtle and difficult cases that are sometimes encountered. In both cases, trending –1018 errors across a common platform was critical to the investigation.

Case Study 1

An Exchange customer with nearly 100 Exchange servers in production was experiencing occasional but recurring –1018 errors on a minority of the servers. All servers used for Exchange were from the same manufacturer, with two different models used depending on the role and load of the server. Errors occurred, seemingly at random, in both server models.

Ordinary diagnostics showed nothing wrong with any of the servers. If a –1018 error occurred on a server, another error might not occur for several months. Microsoft personnel recommended taking some of the servers out of production and running extended Jetstress tests. These tests also revealed nothing. Although the servers were all similar to each other, only a minority of them (about 20 percent) ever experienced –1018 problems. Still, this was far above a reasonable threshold for random errors, and so the server platform was considered suspect.

Microsoft personnel recommended tracking each –1018 error that happened across all servers in a single spreadsheet. (For details, see "Appendix B: –1018 Record Keeping" later in this document.) This technique would allow confirmation of subjective impressions and allow better analysis of subtle patterns that might have been overlooked.

Over time, 17 errors were logged in the spreadsheet and a pattern did emerge. For most of the –1018 errors, the twenty-eighth bit of the checksum was wrong. If it was not the twenty-eighth bit, it was the twenty-third or the thirty-second bit.

One of the characteristics of an Exchange checksum is that if an error introduced on a page is a single bit error (a bit flip), the checksum on the page will also differ from the checksum that should be on the page by only a single bit.

For example, suppose a –1018 error is reported with these characteristics:

  • Expected checksum (that is actually on the page): 39196aa6

  • Actual checksum (calculated as the page is read): 38196aa6.

Checksums are stored in little endian format on an Exchange page. The value as it is stored on the page is therefore derived by reversing the order of the four bytes that make up the eight-digit checksum. For example, for the pair of checksums analyzed in Table 1:

  • The number 51 79 f5 33 becomes 33 f5 79 51.

  • The number 41 79 f5 33 becomes 33 f5 79 41.

To determine whether two checksums match each other except for a single bit, you must convert them to binary and then use the XOR logical operator. An XOR operation compares each bit of one checksum to the corresponding bit of the other. If the bits are the same (both 0 or both 1), the XOR result is 0. If the bits are different, the XOR result is 1. Therefore, a single bit difference between two numbers will result in an XOR result with exactly a single 1 in it. If more than a single bit was changed on a page, the XOR checksum results will be off by more than a single bit. An illustration of this is shown in Table 1.

Checksums            Hexadecimal    Binary
Expected checksum    51 79 f5 33    00110011 11110101 01111001 01010001
Actual checksum      41 79 f5 33    00110011 11110101 01111001 01000001
XOR result           10 00 00 00    00000000 00000000 00000000 00010000

Table 1. Checksum XOR Analysis
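The analysis in Table 1 can be reproduced with a short Python sketch. The function names are illustrative. Bit positions are counted from the left, starting at 1, after the bytes are reversed into stored order; this convention is an assumption inferred from Table 1, where the differing bit is the twenty-eighth.

```python
def stored_order(checksum_hex):
    """Reverse the four bytes of an eight-digit checksum (little endian)."""
    raw = bytes.fromhex(checksum_hex.replace(" ", ""))
    if len(raw) != 4:
        raise ValueError("expected a 32-bit checksum")
    return int.from_bytes(raw, "little")


def differing_bits(expected_hex, actual_hex):
    """Return the 1-based positions, counted from the left, of differing bits."""
    diff = stored_order(expected_hex) ^ stored_order(actual_hex)
    bits = format(diff, "032b")
    return [i + 1 for i, b in enumerate(bits) if b == "1"]


# The pair from Table 1, then the pair from the earlier bulleted example.
print(differing_bits("51 79 f5 33", "41 79 f5 33"))
print(differing_bits("39196aa6", "38196aa6"))
```

A result containing exactly one position indicates a single bit flip. Logging that position for every –1018 error makes recurring patterns, such as the twenty-eighth bit in this case study, easy to spot.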

Patterns in –1018 corruptions are often a valuable clue for hardware vendors in identifying an elusive problem. Along with logging the checksum discrepancies, it is also useful to dump the actual damaged page for direct analysis. (For details, see "Appendix B: –1018 Record Keeping" later in this document.)

A server was finally discovered where the problem happened more than once within a short time frame. Jetstress tests were able to consistently create new –1018 errors, almost always manifesting as a change in the twenty-eighth bit of the checksum. The server was shipped to Microsoft for analysis. The errors could not be reproduced despite weeks of stress testing and diagnostics performed by both Microsoft and the manufacturer.

In the meantime, the customer noticed that –1018 errors had begun to occur on Active Directory domain controllers as well as on Exchange servers. The Active Directory database is based on the same engine as the Exchange database, and it also detects and reports –1018 errors.

It was noticed that the errors seemed to occur on the Active Directory servers after restarting the servers remotely with a hardware reset card. Investigators at Microsoft tried restarting the test server in the same way and were eventually successful in reproducing the problem.

At this point, it might seem that the reset card was the most likely suspect. However, the error did not occur every time after a restart with the card. Most of the time, there was no issue. Long Jetstress runs could be done sometimes with no errors, and then suddenly all Jetstress runs would fail serially.

Eventually, it became apparent that the problem could be reproduced almost every seventh restart with the card. It was not the fault of the card, but the fact that the card performed a complete cold restart of the server, simulating a power reset.

After every seventh cold restart, the server would become unstable. This state would last through warm restarts until the next cold restart, at which time the server would be stable again until after another six cold restarts.

Both server models in production in the customer's organization used the exact same server component with the same part number. However, only 20 percent of the components were manufactured with this problem, which made it much harder to narrow the cause down to the faulty component.

Case Study 2

A major Exchange customer with 250 Exchange servers was plagued with frequent –1018 errors on multiple servers and multiple SAN disk systems. Rarely did a week go by without a full-scale –1018 recovery.

There had been significant data loss multiple times after –1018 errors had occurred. In one case, there was no backup monitoring being done. The most recent Exchange backup had actually been overwritten, with no subsequent backups succeeding. After a month, there was a catastrophic failure on the server, and the database was not salvageable. All user mail was lost. In another case, the first –1018 error corrupted several hundred thousand pages in the database, and the transaction log drives were also affected. Backups had also been neglected on this server as the problem worsened. The most recent backup was several weeks old, and thus all mail since then was lost.

Microsoft Product Support Services had been called multiple times over the last several months and had been mostly successful in recovering data after each problem. However, each of these cases involved individual server operators and Product Support Services engineers, working in isolation on recovery, but not focusing on root cause analysis across all servers.

The data loss cases got the attention of both the Microsoft account team and the Exchange customer's executive management. As Microsoft began correlating cases and asking for more information about the prevalence of the issue, it became clear quickly that the rate of –1018 occurrences was far above the standard threshold.

Information about past issues was mostly unavailable or incomplete. However, Microsoft created a spreadsheet to track each new problem. The spreadsheet started to fill quickly, and patterns began to emerge. The problem was that there was no single pattern, but multiple patterns.

In several cases, the lowest two bytes of the checksum were changed. This seemed promising, but then came several errors where bits 29 and 30 were wrong, with nothing else in common. Then there was an outbreak of errors where there were large-scale checksum differences with no discernible pattern in the checksums or the damaged pages. On some servers, there were multiple bad pages. There were frequent transient –1018 errors, and frequently a checksum on a full database would reveal different errors on successive runs.

The investigation and resolution lasted almost a year. As time went on, it became clear that some servers and disk frames were much more problematic than others, and that this was not just a general problem with all the Exchange servers across the organization. During that year, the following problems were discovered to be root causes of –1018 errors:

  • Server operators were hard cycling servers with disk controllers that had no I/O atomicity guarantees.

  • SANs had no logical unit number (LUN) masking, which allowed multiple servers to control a single disk simultaneously and thus corrupt it.

  • Badly out-of-date firmware revisions were in use, including versions known to cause data corruption.

  • Cluster systems had not passed Windows Hardware Quality Labs (WHQL) certification. These clusters had disk controllers that were unable to handle in-flight disk I/Os during cluster state transitions.

  • Antivirus applications were not configured correctly to exclude Exchange data files. This was causing sudden quarantine, deletion, or alteration of Exchange files and processes. Generic file scanning antivirus programs should never be used on Exchange databases. Many vendors have effective Exchange-aware scanners that implement the Microsoft Exchange antivirus APIs.

  • A vendor hardware bug accounted for a minority of the errors.

  • Aging and progressively failing hardware, which had exceeded its lifecycle, caused obvious problems.

Correcting the –1018 root causes was an arduous, but ultimately worthwhile process. It required not only changes to hardware and configurations, but also operational improvements. Not only was the organization successful in dramatically reducing the incidence of –1018 errors, but also in greatly decreasing the impact of each error on end users by implementing effective monitoring and recovery procedures.

This case study contrasts sharply with Case Study 1. In Case Study 1, a mysterious and subtle hardware bug was the single root cause for all the failures. However, for most Exchange administrators, the key to reducing and controlling –1018 errors will be implementing ordinary operational improvements. Most of the time, the patterns revealed by keeping track of –1018 errors across your organization will point to obvious errors and problems that should be defended against. Case Study 1, while perhaps more interesting, was atypical, while Case Study 2 is representative of the process that several Exchange organizations have gone through to control and reduce –1018 errors.

Appendix B: –1018 Record Keeping

For the majority of –1018 errors, the root cause will be indicated by another correlated error or failure. For errors where the cause is not so obvious, tracking –1018 errors across time and across servers is critical for identifying the root cause.

Even for errors where the root cause is easily determined, there is still value in consistently tracking –1018 errors. You can learn how the errors affect your organization, and where operational and other improvements could reduce the impact of the errors.

You may want to track errors in a database, in a spreadsheet, or using a simple text file. At Microsoft, Microsoft Office Excel 2003 spreadsheets are used. The following list of fields can be adapted to your needs and your willingness to track detailed information.

Essentials

These files should always be saved for each –1018 error:

  • Application and system logs for the bookend period, that is, between the time of the last good backup and the time when the –1018 error was reported.

  • Page dumps.

Eseutil Page Dump

This Eseutil facility will show you the contents of important header fields on the page. This command requires the logical page number. You can calculate the logical page number from the error description as described in "Page Ordering" earlier in this document.

If, for example, logical page 578 is damaged in the database file Priv1.edb, you can dump the page to the file 578.txt with this command:

Eseutil.exe /M priv1.edb /P578 > 578.txt

Note that there is no space between the /P switch and the page number.

The output of this command might look similar to this:

Microsoft(R) Exchange Server Database Utilities

Version 6.5

Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...

Database: priv1.edb

Page: 578

checksum <0x03300000, 8>: 2484937984258 (0x0000024291d88902)

expected checksum = 0x0000024291d88902

****** checksum mismatch ******

actual checksum = 0x00de00de91d889fd

new checksum format

expected ECC checksum = 0x00000242

actual ECC checksum = 0x00de00de

expected XOR checksum = 0x91d88902

actual XOR checksum = 0x91d889fd

checksum error is NOT correctable

dbtimeDirtied <0x03300008, 8>: 12701 (0x000000000000319d)

pgnoPrev <0x03300010, 4>: 577 (0x00000241)

pgnoNext <0x03300014, 4>: 579 (0x00000243)

objidFDP <0x03300018, 4>: 114 (0x00000072)

cbFree <0x0330001C, 2>: 6 (0x0006)

cbUncommittedFree <0x0330001E, 2>: 0 (0x0000)

ibMicFree <0x03300020, 2>: 4038 (0x0fc6)

itagMicFree <0x03300022, 2>: 3 (0x0003)

fFlags <0x03300024, 4>: 10370 (0x00002882)

Leaf page

Primary page

Long Value page

New record format

New checksum format

TAG 0 cb:0x0000 ib:0x0000 offset:0x0028-0x0027 flags:0x0000

TAG 1 cb:0x000e ib:0x0000 offset:0x0028-0x0035 flags:0x0001 (v)

TAG 2 cb:0x0fb8 ib:0x000e offset:0x0036-0x0fed flags:0x0001 (v)

If you do not see a checksum mismatch in the dump, that does not necessarily mean that the –1018 error is transient. It is possible that a mistake was made in calculating the logical page number. It is a good idea to double-check your arithmetic, and to dump the preceding and next pages as well if you do not find a –1018 error on the dumped page. Running Eseutil /K against the entire database will also provide an additional check.
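
If you save many page dumps, a short script can pull the checksum fields out of each one. The Python sketch below assumes that the dump text uses the same labels as the sample output shown above ("expected checksum", "actual checksum", and the "checksum mismatch" banner); other Eseutil versions may format these lines differently. The function name is hypothetical.

```python
import re


def parse_page_dump(text: str) -> dict:
    """Extract checksum details from the text of an Eseutil page dump."""

    def find_hex(label: str):
        # Matches lines such as "expected checksum = 0x0000024291d88902".
        match = re.search(rf"{label}\s*=\s*0x([0-9a-fA-F]+)", text)
        return int(match.group(1), 16) if match else None

    expected = find_hex("expected checksum")
    actual = find_hex("actual checksum")
    both_found = expected is not None and actual is not None
    return {
        "expected": expected,
        "actual": actual,
        "mismatch": "checksum mismatch" in text,
        # Bits that differ between the two checksums, if both were found.
        "xor": expected ^ actual if both_found else None,
    }


if __name__ == "__main__":
    with open("578.txt", encoding="utf-8", errors="replace") as dump_file:
        result = parse_page_dump(dump_file.read())
    print("mismatch reported:", result["mismatch"])
```

A result with "mismatch" set to False only means that this page dump did not report a problem. As noted above, confirm the page number and check the neighboring pages before treating the error as transient.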

Required Error Information

For each –1018 occurrence, you should always log the following:

  • Application log –1018 event information:

      • Date and time

      • Server name

      • Event ID

      • Event description

  • If a cluster, cluster node where the error occurred

  • Server make and model

  • Storage type:

      • Direct access storage device (DASD)

      • Fiber Channel Storage Area Network (SAN)

      • Internet small computer system interface (iSCSI) SAN

      • Network-attached storage

  • Storage make and model:

      • Disk controller

      • Multiple path configuration

  • Permanent location or share for event, log, and dump files

Additional Information

For each –1018 occurrence, you can also note the following:

  • Bookend period anomalies:

      • Restart

      • Cluster transition

      • Disk error

      • Memory error

      • Other

  • File offset

  • Logical page number (calculated from byte offset)

  • Actual checksum (calculated at run time)

  • Expected checksum (read from page)

  • Binary actual checksum

  • Binary expected checksum

  • Checksum XOR result

  • How discovered (run time, mount failure, or backup failure)

  • Server unavailable or available

  • Last good backup time

  • Error confirmed by, such as: Eseutil /m /p, /k

  • Permanent or transient error

  • Location of files (Eseutil and Esefile page dumps, raw page dumps, MPSReports)

  • Server hardware

  • Server BIOS

  • Controller

  • Controller firmware revision

  • Storage

  • Impact (databases affected)

  • Recovery downtime

  • Recovery strategy

  • Root cause

  • Comments

  • Entry by
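
If you prefer a simple text file to a spreadsheet, the fields above map naturally onto a CSV file. The following Python sketch shows one possible layout. The class name, the column names, and the choice of which fields to include are illustrative assumptions, not the format used at Microsoft.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Error1018Record:
    """One tracked -1018 occurrence (hypothetical column names)."""
    # Required error information
    date_time: str
    server_name: str
    event_id: int
    event_description: str
    cluster_node: str = ""
    server_make_model: str = ""
    storage_type: str = ""
    storage_make_model: str = ""
    files_location: str = ""
    # A subset of the additional information
    file_offset: str = ""
    logical_page: str = ""
    how_discovered: str = ""
    permanent_or_transient: str = ""
    root_cause: str = ""
    comments: str = ""
    entry_by: str = ""


def records_to_csv(records) -> str:
    """Render records as CSV text with a header row."""
    buffer = io.StringIO()
    names = [field.name for field in fields(Error1018Record)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()
```

Keeping the required fields mandatory and the additional fields optional mirrors the two lists above: a record can be created as soon as the event is seen and filled in as the investigation proceeds.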

XOR Calculation Sample for Excel

Appendix A described how to compare checksums to look for patterns. The Microsoft Office Excel formulas below can be used to automate this comparison. You must install the Analysis ToolPak for Excel so that the necessary functions (HEX2BIN and BIN2HEX) are available. The ToolPak can be installed from Add-Ins on the Tools menu in Excel.

Converting a Hexadecimal Checksum to Binary

Copy this formula into an Excel cell. The formula assumes that the hexadecimal checksum (eight hexadecimal digits) is in cell A1. If the checksum is in a different cell, change each reference to A1 accordingly. Enter the formula as a single line:

=CONCATENATE(HEX2BIN(MID(A1,7,2),8)," ",HEX2BIN(MID(A1,5,2),8)," ",HEX2BIN(MID(A1,3,2),8)," ",HEX2BIN(MID(A1,1,2),8))

This formula also reverses the order of the bytes in the checksum to conform to the Intel little-endian storage format. The bits within each byte are not reversed.
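
For readers who want to check the spreadsheet result outside Excel, the same conversion can be sketched in Python. The function name is hypothetical, and the sketch assumes an eight-digit (32-bit) hexadecimal value, as the Excel formula does.

```python
def hex_checksum_to_binary(hex_checksum: str) -> str:
    """Convert an 8-digit hex checksum to four space-separated binary bytes.

    The byte order is reversed, mirroring the Excel formula, which reads
    characters 7-8 first and characters 1-2 last.
    """
    digits = hex_checksum.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 8:
        raise ValueError("expected exactly 8 hexadecimal digits")
    raw = bytes.fromhex(digits)
    return " ".join(f"{byte:08b}" for byte in reversed(raw))


if __name__ == "__main__":
    print(hex_checksum_to_binary("91d88902"))
    # prints: 00000010 10001001 11011000 10010001
```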

Using XOR with Two Binary Checksums

This formula assumes that the binary checksums are in cells B1 and B2. If the checksums are in other cells, replace each occurrence of B1 or B2 as appropriate. Enter the formula as a single line:

=CONCATENATE((HEX2BIN(BIN2HEX(VALUE(SUBSTITUTE(MID(B1,1,8)+MID(B2,1,8),2,0)),8),8))," ",(HEX2BIN(BIN2HEX(VALUE(SUBSTITUTE(MID(B1,10,8)+MID(B2,10,8),2,0)),8),8))," ",(HEX2BIN(BIN2HEX(VALUE(SUBSTITUTE(MID(B1,19,8)+MID(B2,19,8),2,0)),8),8))," ",(HEX2BIN(BIN2HEX(VALUE(SUBSTITUTE(MID(B1,28,8)+MID(B2,28,8),2,0)),8),8)))

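The Excel formula performs the XOR indirectly: it adds each pair of 8-digit binary strings as if they were decimal numbers, so every digit of the sum is 0, 1, or 2, and then replaces each 2 with 0. The Python sketch below produces the same result with a plain integer XOR instead of that digit trick. The function names are hypothetical, and the inputs are assumed to be space-separated 8-bit groups like those produced by the conversion formula.

```python
def xor_binary_checksums(first: str, second: str) -> str:
    """XOR two checksums written as space-separated 8-bit binary groups."""
    groups_a, groups_b = first.split(), second.split()
    if len(groups_a) != len(groups_b):
        raise ValueError("checksums must have the same number of bytes")
    return " ".join(
        f"{int(a, 2) ^ int(b, 2):08b}" for a, b in zip(groups_a, groups_b)
    )


def count_flipped_bits(xor_result: str) -> int:
    """Count the bits that differ between the two original checksums."""
    return xor_result.count("1")


if __name__ == "__main__":
    expected = "00000010 10001001 11011000 10010001"
    actual = "11111101 10001001 11011000 10010001"
    result = xor_binary_checksums(expected, actual)
    print(result, count_flipped_bits(result))
    # prints: 11111111 00000000 00000000 00000000 8
```

Each 1 in the result marks a bit position where the expected and actual checksums differ.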

Situation

Error –1018 signals that an Exchange database file has been damaged by a hardware or file system problem. Exchange reports this error to provide early warning of possible server failure and data loss.

Solution

This paper shows you how Microsoft IT responds to this error and recovers affected Exchange data. It also covers the methods and tools used to find root causes and resolve the underlying problems responsible for the error.

Benefits

  • Improve your monitoring of Exchange data integrity.
  • Increase your ability to determine seriousness and urgency of –1018 errors.
  • Learn specific recovery strategies and how to decide when to implement them.
  • Improve your operational effectiveness in handling hardware and data integrity problems.

Products & Technologies

  • Microsoft Exchange Server 2003
  • Microsoft Windows Server 2003
  • Exchange Jetstress and LoadSim I/O and capacity modeling tools
  • Microsoft Office Excel 2003
  • Medusa Labs Test Tool Suite by Finisar
  • Exchange Eseutil and Isinteg repair and integrity verification tools


Written by Teus.