mycroes

There's always time to play

Thursday, June 17, 2010

Recovering from glue objects in OpenLDAP

After some syncing issues and a few transfers of /var/lib/ldap between servers, our company LDAP database had lost its root organization entry. Running slapcat showed the entry with objectClass glue and all of its attributes gone. Worse, it was the same on all of our servers.
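To check whether more entries are affected, the slapcat output can be scanned for glue entries. This is only a rough sketch: find_glue_entries is a helper name made up for this example, and it assumes the DNs in the dump are neither base64-encoded nor wrapped across lines.

```shell
# Print the DN of every entry that carries "objectClass: glue".
# Reads the LDIF file given as $1, or standard input when no file is given.
find_glue_entries() {
    awk '
        /^dn: /                            { dn = substr($0, 5) }  # remember the current DN
        tolower($0) == "objectclass: glue" { print dn }            # report it if the entry is glue
        /^$/                               { dn = "" }             # blank line ends the entry
    ' "${1:--}"
}

# Example (database number 1 is an assumption, as in the rest of this post):
#   slapcat -n 1 | find_glue_entries
```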

The first thing that came to mind to fix this issue was doing an ldapmodify on the entry, however ldapmodify would return ldap_modify: No such object (32). The logical next step would then be to add the object, since ldapmodify complains it's not there... However, that would result in ldap_add: Already exists (68)! Amazing, one program telling me the object can't be modified because it's not there, the other telling me I can't add it because it exists.

I did some searching, but couldn't find a proper solution or anyone with a similar issue. I could of course start from scratch, but that would destroy the sync status, modify timestamp, modifier's name, create timestamp, creator's name and perhaps even more, so in my humble opinion that wasn't really an option.

During my (re)search I did come across slapadd. slapadd can be used to do offline database edits (at least additions to the database). So I stopped slapd, and fired up slapadd and entered my LDIF... Same issue! The entry exists, so it can't be added. slapadd doesn't seem to support modify either (I'm not complaining, just stating the facts), so I had to figure out something else...

Suddenly I had it all figured out. slapadd and slapcat are similar tools in that they operate directly on the database instead of talking to slapd. Thus if you slapcat your database you can give the output back to slapadd!
# slapcat -n 1 > entries.ldif
# slapadd -n 1 -l entries.ldif

Of course this very simple example will result in the same errors, because all your entries are already there. It would also be nice to fix the broken entry while we're at it. That leads to the following list of commands (the code assumes the broken tree is database number 1; replace it with your database index if it's not the first database):
  1. # cp -ar /var/lib/ldap{,.bak}
  2. # slapcat -n 1 > entries.ldif
  3. # rm -r /var/lib/ldap
  4. # mkdir -p /var/lib/ldap/bdb
    This line assumes a BDB database. You can probably replace bdb with hdb if you're using HDB.
  5. Now edit entries.ldif so your entry makes sense again. Fix the objectClass values (be sure to create a correct objectClass chain, e.g. top, dcObject, organization), the structuralObjectClass and the attributes required by the newly set objectClasses (e.g. dc, o).
  6. # slapadd -n 1 -l entries.ldif
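The edit in step 5 can also be scripted instead of done by hand. The following is a minimal sketch, not the exact edit I made: fix_glue_entry is a made-up helper, it assumes the broken entry should become a dcObject/organization entry, and it assumes the DN appears on a single unencoded line without backslashes. Operational attributes such as entryUUID and the timestamps are left untouched.

```shell
# Rewrite one glue entry in an LDIF dump and print the result to stdout.
# $1 = LDIF file, $2 = DN of the broken entry, $3 = dc value, $4 = o value
fix_glue_entry() {
    awk -v target="dn: $2" -v dc="$3" -v o="$4" '
        /^dn: / { in_target = ($0 == target) }   # are we inside the broken entry?
        /^$/    { in_target = 0 }                # blank line ends the entry
        in_target && /^dn: / {
            # Emit the DN followed by the repaired class chain and required attributes.
            print
            print "objectClass: top"
            print "objectClass: dcObject"
            print "objectClass: organization"
            print "structuralObjectClass: organization"
            print "dc: " dc
            print "o: " o
            next
        }
        # Drop the glue classes and any stale dc/o values from the broken entry.
        in_target && tolower($0) ~ /^(objectclass|structuralobjectclass|dc|o): / { next }
        { print }
    ' "$1"
}

# Example (DN and values are placeholders):
#   fix_glue_entry entries.ldif "dc=example,dc=com" example "Example Inc" > fixed.ldif
```

Review the resulting file before feeding it to slapadd; the dc value has to match the RDN of the entry.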

Now your entry should be back again, with a proper objectClass and related attributes. If you get errors along the way, make sure there aren't more entries with attributes that aren't available in the schema files. Just remove the incorrect attributes (and probably incorrect objectClasses accompanying the attributes) from the LDIF and repeat the database delete and add steps (or remove everything earlier in the LDIF and just add the new entries using slapadd, of course!)

The last step is to index the database. I don't know if it's required (slapd will run fine without it), but before starting slapd run the following:
# slapindex -n 1

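Now your LDAP tree should be back in a proper state again! If you ran these commands as root and slapd runs as a different user, make sure the new database files are owned by that user before starting slapd.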

There's just one issue left... If you didn't change the contextCSN attributes, slapd won't sync the entry to the other servers, because they will all think the entry never changed (and thus the other servers will keep the broken entry). There's an easy solution: use ldapmodify to change an attribute on the entry, so the contextCSN updates and the change propagates to the other servers. The real fix would be to set the contextCSN for the server ID of the server you're editing to the current time. However, this is more prone to mistakes and the result should be the same (unless you're using delta syncrepl, where it is possible that only the change itself will get propagated).
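As an illustration of the easy solution, the change can be as small as replacing one harmless attribute. This is a sketch under assumptions: make_touch_ldif is a made-up helper, description is just one attribute the organization objectClass allows, and the bind DN in the usage comment is a placeholder.

```shell
# Print an LDIF modify record that replaces the description of one entry.
# $1 = DN of the repaired entry, $2 = new description text
make_touch_ldif() {
    printf 'dn: %s\nchangetype: modify\nreplace: description\ndescription: %s\n' "$1" "$2"
}

# Example, with slapd running again:
#   make_touch_ldif "dc=example,dc=com" "Root organization" > touch.ldif
#   ldapmodify -x -D "cn=admin,dc=example,dc=com" -W -f touch.ldif
```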

This was my not-so-short introduction to LDAP disaster recovery without losing contextual information. I'm hoping you enjoyed reading this post and that it helped you to recover from long-standing errors.

7 comments:

Anonymous said...

There is an easier way, at least on OpenLDAP 2.4.24 (which is the only version I've tested). Create an LDIF that changes the entries back to what they should be by replacing each attribute. Then use ldapmodify with the -M option, which turns on the manageDSAit control. This causes the entries to be treated as normal entries, not as referrals. After that, the entries replicated normally across my environment.

Unknown said...

Hello guys!


Thanks for the 2 solutions.

Jeff Medcalf, your solution works perfectly!


The question is: why?? Why are they transformed into "glue" objects?

I have 24 OpenLDAP consumers (centers) + 1 OpenLDAP provider.
When entries turn into "glue" objects, it affects the functioning of OpenLDAP: accounts are deleted (cn=dummy, for example, which is used to connect several web tools), configurations are deleted, etc.

I have been looking for a solution, or even the beginning of an answer, for 2 months.

Thanks very much!

Michael Croes said...

Hi Francois,

As far as I know these objects can exist when they're not yet replicated, but their existence is replicated (thus, glue object is created). The attributes should be transferred later, but at that point the original object might be gone, causing the glue object to stick around. Hope this helps you along.
Regards,

Michael

Unknown said...

Hi guys,

I'm back. I had other projects to deal with.

I read that on "http://www.openldap.org/doc/admin24/replication.html", section "18.1.1.2. Syncrepl Details":
"Because a general search filter can be used in the syncrepl specification, some entries in the context may be omitted from the synchronization content. The syncrepl engine creates a glue entry to fill in the holes in the replica context if any part of the replica content is subordinate to the holes. The glue entries will not be returned in the search result unless ManageDsaIT control is provided."

Can anyone please explain that to me?

Could the "glue" objects be caused by a bad "search filter" configuration?

"unless ManageDsaIT control is provided" > I don't understand this part.


Thanks very much!

Francois

Michael Croes said...

In normal conditions the glue object will be resolved when OpenLDAP finds it. However, if it can't be resolved (because the original object no longer exists), it will just return the glue object. Note that this is based on my experience only; I didn't check any of this in the code.

grawity said...

Recent OpenLDAP versions now support the "relax" control, which allows the structural objectClass to be replaced directly over LDAP (using "ldapmodify -e relax").

Though it seems that it's already implied for glue entries, so "ldapmodify -M" is enough to fix this particular issue.

Udo Rader said...

More than 10 years later and I am still running into this very same problem. Our DIT is not big, and our network connectivity is neither bad nor slow.

When setting up new consumers, from time to time some of them fail to persist the real objects but keep the glue objects instead. This can be 1:1 consumers, containing the entire DIT (with no searchbase configured at all) but also consumers limited by a searchbase.

Yes, I can manually "fix" the broken entries on the consumers with ldapmodify -M, but for a critical resource like LDAP this is a very frightening workaround.

Nevertheless, thanks for putting up the workaround and making me feel less alone with this strange problem :)