Mantis Bugtracker

Viewing Issue Simple Details
ID: 0000010
Category: [MyDNS-NG] Global
Severity: crash
Reproducibility: have not tried
Date Submitted: 2008-06-10 22:45
Last Update: 2014-08-07 18:58
Reporter: jorge
View Status: public
Assigned To: howardwilkinson
Priority: normal
Resolution: fixed
Status: resolved
Product Version:
Summary: 0000010: MyDNS consuming 100% CPU when database fails
Description: Hey Howard,
I have had this 3 times, and the last time it wasn't the database that went down, at least not that I noticed.
The main thing is that when the database is unavailable, MyDNS starts consuming 100% CPU, causing the server to stop responding even if the database comes online again.
Can it check for the database coming back online and start working again, without the administrator having to kill and restart it?
Additional Information:
Tags: No tags attached.
Attached Files:
mydns_valgrind.txt (8,047 bytes) 2008-09-20 23:02
mydns.zip (1,021,800 bytes) 2008-12-04 20:02
mydns-with-debug.zip (169,004 bytes) 2008-12-04 20:02

Notes
(0000031)
howardwilkinson (administrator)
2008-07-23 08:44

The database code needs a major rewrite for various reasons, not least so that reconnects, retries, and failover can be made to work. I will schedule this when I get back to MyDNS in anger.
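
As a rough illustration of the reconnect-and-retry behaviour discussed here (not MyDNS's actual code, which is C), the Python sketch below retries a failing connection with a capped exponential backoff and sleeps between attempts, so the process does not spin at 100% CPU while the database is down. The names `connect_with_backoff`, `connect`, and `sleep` are hypothetical, chosen only for this example.

```python
import time


def connect_with_backoff(connect, base_delay=0.5, max_delay=30.0,
                         max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping between failed attempts.

    connect is any callable that returns a connection or raises on failure
    (a hypothetical stand-in for the real database client call).
    """
    delay = base_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except Exception:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # Sleep instead of retrying immediately: a tight retry loop is
            # one way a daemon can end up pinning the CPU during an outage.
            sleep(delay)
            # Double the wait each time, up to max_delay.
            delay = min(delay * 2, max_delay)
```

In a single-threaded server a blocking sleep would stall query handling, so a real fix would more likely schedule the retry as a timed task; the sketch only shows the backoff arithmetic.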
(0000032)
jorge (administrator)
2008-08-27 09:53

Howard, is it possible that the server goes down when it receives a request like this? I say this because it went down at the same time.

----

Aug 27 09:40:48 cisne mydns[13182]: 149.20.52.166: TXT version.bind. (540) NEED_ANSWER, High Priority Normal Task: NOTIMP - qclass not available
(0000033)
jorge (administrator)
2008-08-27 17:35

Ah, some other info that may be useful; it is almost the same:

Aug 27 09:40:48 cisne mydns[13182]: 149.20.52.166: TXT version.bind. (540) NEED_ANSWER, High Priority Normal Task: NOTIMP - qclass not available

I was checking the dates of these log entries against the time it failed, and they match to within a few seconds.

Can it be when TXT records are requested?
This is my table structure; it may help you:
-----------
+----------+---------------------------------------------------------------------------+------+-----+---------+----------------+
| Field    | Type                                                                      | Null | Key | Default | Extra          |
+----------+---------------------------------------------------------------------------+------+-----+---------+----------------+
| id       | int(10)                                                                   | NO   | PRI | NULL    | auto_increment |
| zone     | int(10)                                                                   | NO   | MUL | 0       |                |
| name     | char(64)                                                                  | NO   |     |         |                |
| data     | varbinary(128)                                                            | NO   |     | NULL    |                |
| aux      | int(10) unsigned                                                          | NO   |     | NULL    |                |
| ttl      | int(10) unsigned                                                          | NO   |     | 86400   |                |
| type     | enum('A','AAAA','CNAME','HINFO','MX','NAPTR','NS','PTR','RP','SRV','TXT') | YES  |     | A       |                |
| edata    | blob                                                                      | YES  |     | NULL    |                |
| edatakey | char(32)                                                                  | YES  |     | NULL    |                |
+----------+---------------------------------------------------------------------------+------+-----+---------+----------------+
9 rows in set (0.01 sec)
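
To test whether a query like the `TXT version.bind.` one in the log above triggers the failure, it could be replayed against a non-production instance. The Python sketch below only builds the raw DNS query packet. It assumes the query was sent in class CHAOS (3), which is how `version.bind` is conventionally asked but is not stated in the log; `build_query` is a hypothetical helper name.

```python
import struct

QTYPE_TXT = 16   # TXT record type (RFC 1035)
QCLASS_CH = 3    # CHAOS class (RFC 1035); assumed, the log does not state it


def build_query(name, qtype=QTYPE_TXT, qclass=QCLASS_CH, query_id=0x1234):
    """Return a minimal DNS query packet for name as bytes."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1 to 63 bytes: %r" % label)
        # Each label is sent as a length byte followed by its bytes.
        qname += bytes([len(raw)]) + raw
    qname += b"\x00"  # the root label terminates the name
    return header + qname + struct.pack("!HH", qtype, qclass)
```

The resulting bytes can be sent with a UDP socket's `sendto` to port 53 of the test server while watching its log for the matching NOTIMP line.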
(0000034)
jorge (administrator)
2008-08-27 18:26

Extra info, although I don't think it will help a lot. This was also today, in one of the two messages below:

---
mydns[9884]: segfault at 8d23f14 ip 08071659 sp bfd2be00 error 6 in mydns[8048000+32000]
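
The kernel message above can be unpacked a little further. The Python sketch below parses it, decodes the low bits of the x86 page-fault error code (which the kernel prints in hexadecimal), and computes the offset of the faulting instruction inside the mapped binary. `decode_segfault` is a hypothetical helper written for this note, not an existing tool.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) in\s+"
    r"(?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode a Linux x86 'segfault at ...' kernel log line into a dict."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "fault_addr": int(m.group("addr"), 16),
        "offset_in_object": ip - base,
        "page_present": bool(err & 1),  # bit 0: 0 means page not present
        "write": bool(err & 2),         # bit 1: 1 means a write access
        "user_mode": bool(err & 4),     # bit 2: 1 means fault in user mode
    }
```

For the line above this gives a user-mode write to an unmapped page, with the instruction at offset 0x29659 into the `mydns` mapping. Because 0x8048000 is the traditional load address of a 32-bit x86 executable, the `ip` value 0x08071659 could probably be passed straight to `addr2line -e mydns` if the binary has debug symbols.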
(0000035)
jorge (administrator)
2008-09-20 22:00

Howard, Drew found the source of the problem (or at least one of them).

On Windows, if you run:

nslookup \\\.domain.pt. ns1.domain.pt

where "domain.pt" is a domain in the MyDNS database, MyDNS will crash.
I compiled MyDNS with the -g flag, and with gdb I got only this:
---
mydns[2336]: mydns pre-1.2.8 started Sat Sep 20 21:33:31 2008 (listening on 2 addresses)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7cfc6b0 (LWP 2336)]
0xb7d6e623 in strlen () from /lib/libc.so.6
(gdb) bt
#0 0xb7d6e623 in strlen () from /lib/libc.so.6
#1 0x08074504 in ?? ()
(0000036)
jorge (administrator)
2008-09-20 23:01

Howard,
Good news, I think: with valgrind I seem to have caught the problem.
Check the attached log and give me some feedback.
(0000037)
howardwilkinson (administrator)
2008-11-05 08:29

Jorge et al,

I believe I have fixed this memory leak in the 1.2.8.5 patch release. It needs testing and feedback before I clear the bug.

I intend to do a code review as a result of this bug to check that we are catching failure paths, as that is what this was.

Howard.
(0000043)
jorge (administrator)
2008-11-05 09:43

OK Howard,
I'm already using it on my server.
I'll keep an eye on it, and if everything's OK I'll close this.
(0000048)
howardwilkinson (administrator)
2008-11-15 11:01

Major changes in the scheduling code, plus some bug fixes, may have fixed this in 1.2.8.11+. We need to verify whether it still happens.
(0000049)
jorge (administrator)
2008-11-15 22:23

I've been tracking your changes in SVN and reading your emails to the mailing list.
So far everything's OK, and it seems more stable.
If anything comes up I'll let you know; if it doesn't happen anymore, I'll close this.
(0000050)
peterschen (reporter)
2008-12-04 19:58

So, this is probably a better place to track this bug down. I also checked out the sources recently, but as I come more from the Java/C# world, it will take me a while to be able to really help.

I just realized that I made one run with debug enabled, but then I disabled debug and let the server run again. As the "timeout" before the problem triggers is a lot longer now, I'm not sure I can leave the server running in debug mode: performance decreases significantly with debug enabled, and the server is a production machine.
(0000052)
jorge (administrator)
2009-01-26 13:21

Howard,
Finally we can close this, I think.
I'm just going to wait 2 or 3 more days to see if it stays stable, and then I'll close this nasty bug!
(0000054)
jorge (administrator)
2009-02-03 22:30

Solved, Howard.

Issue History
Date Modified Username Field Change
2008-06-10 22:45 jorge New Issue
2008-06-10 22:45 jorge Status new => assigned
2008-06-10 22:45 jorge Assigned To => howardwilkinson
2008-07-23 08:44 howardwilkinson Note Added: 0000031
2008-08-27 09:53 jorge Note Added: 0000032
2008-08-27 17:35 jorge Note Added: 0000033
2008-08-27 18:26 jorge Note Added: 0000034
2008-09-20 22:00 jorge Note Added: 0000035
2008-09-20 23:01 jorge Note Added: 0000036
2008-09-20 23:02 jorge File Added: mydns_valgrind.txt
2008-11-05 08:29 howardwilkinson Note Added: 0000037
2008-11-05 09:43 jorge Note Added: 0000043
2008-11-15 11:01 howardwilkinson Note Added: 0000048
2008-11-15 22:23 jorge Note Added: 0000049
2008-12-04 19:58 peterschen Note Added: 0000050
2008-12-04 20:02 peterschen File Added: mydns.zip
2008-12-04 20:02 peterschen Issue Monitored: peterschen
2008-12-04 20:02 peterschen Issue End Monitor: peterschen
2008-12-04 20:02 peterschen Issue Monitored: peterschen
2008-12-04 20:02 peterschen File Added: mydns-with-debug.zip
2009-01-26 13:21 jorge Note Added: 0000052
2009-02-03 22:30 jorge Note Added: 0000054
2009-02-03 22:30 jorge Status assigned => resolved
2009-02-03 22:30 jorge Fixed in Version => 1.2.8.23
2009-02-03 22:30 jorge Resolution open => fixed
2009-09-08 12:08 peterschen Issue End Monitor: peterschen
2014-08-07 18:58 jameno123 Target Version Trunk => 1.2.8.23

