Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0000433OCamlCamlIDLpublic2001-07-13 19:332001-07-30 16:26
Reporteradministrator 
Assigned To 
PrioritynormalSeverityminorReproducibilityalways
StatusclosedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0000433: camlidl: problem with GC (2)
DescriptionFull_Name: Dmitry Bely
Version: camlidl current cvs
OS: Windows NT 4.0
Submission from: d207.p3.col.ru (195.210.132.207)


From my PR#431:

> Unfortunately I cannot create small test example that demonstrates the bug.
But
> I can suppose that it happens because GC moves in memory the caml object
> representing the interface, which obviously is not acceptable. Can this be
true?
> If yes, could you fix it?

Of course moving caml object in memory by GC is very correct (I was wrong), and
is traced via register_global_root(), but nonetheless the bug exists there. I've
managed to isolate it myself. comintf.c from camlidl's runtime contains he
following code:

[---cut---]
static void camlidl_finalize_interface(value intf)
{
  interface IUnknown * i = (interface IUnknown *) Field(intf, 1);
  i->lpVtbl->Release(i);
}

value camlidl_pack_interface(void * intf, camlidl_ctx ctx)
{
  value res = alloc_final(2, camlidl_finalize_interface, 0, 1);
  Field(res, 1) = (value) intf;
  if (ctx != NULL && (ctx->flags & CAMLIDL_ADDREF)) {
    struct IUnknown * i = (struct IUnknown *) intf;
    i->lpVtbl->AddRef(i);
  }
  return res;
}

...

ULONG STDMETHODCALLTYPE camlidl_Release(struct camlidl_intf * this)
{
  struct camlidl_component * comp = this->comp;
  ULONG newrefcount = InterlockedDecrement(&(comp->refcount));
  int i;

  if (newrefcount == 0) {
    for (i = 0; i < comp->numintfs; i++) {
      struct camlidl_intf * intf = &(comp->intf[i]);
      remove_global_root(&(intf->caml_object));
      if (intf->typeinfo != NULL) {
        struct IUnknown * i = (struct IUnknown *) intf->typeinfo;
        i->lpVtbl->Release(i);
      }
    }
    stat_free(comp);
    InterlockedDecrement(&camlidl_num_components);
  }
  return newrefcount;
}
[---cut---]

The problem is that camlidl_finalize_interface() may be invoked when GC moves
object in memory, calling i->lpVtbl->Release(i) (in fact camlidl_Release()) and
decrementing reference counter. It becomes equal to zero and the memory,
occupied by the interface, is incorrectly reclaimed via stat_free(comp). That's
the bug.

I temporarily removed i->lpVtbl->Release(i) from camlidl_finalize_interface()
and the crash disappeared (memory leak is better than the crash :-)). Now I am
waiting for the correct fix from you :-) (BTW, why *both*
camlidl_pack_interface() and camlidl_unpack_interface() increment reference
counter via AddRef()? I cannot understand that ...)

TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0000053)
administrator (administrator)
2001-07-15 19:41

> Of course moving caml object in memory by GC is very correct (I was
> wrong), and is traced via register_global_root(), but nonetheless
> the bug exists there. I've managed to isolate it myself. comintf.c
> from camlidl's runtime contains he following code:
> The problem is that camlidl_finalize_interface() may be invoked when
> GC moves object in memory, calling i->lpVtbl->Release(i) (in fact
> camlidl_Release()) and decrementing reference counter. It becomes
> equal to zero and the memory, occupied by the interface, is
> incorrectly reclaimed via stat_free(comp). That's the bug.

I don't think it's a bug. The GC calls camlidl_finalize_interface()
when the corresponding Caml object becomes unreferenced from Caml.
Then, it decreases the refcount to signal that Caml no longer holds a
reference to the COM object. If the refcount becomes zero, this means
that nobody (including your C++ code) is still holding a reference to
the object. Hence, deallocating it is correct. Isn't it?

In other terms, if your C++ code still accesses the COM object after
its refcount is 0, then probably your code forgot to increment the
refcount as per the COM spec.

> (BTW, why *both*
> camlidl_pack_interface() and camlidl_unpack_interface() increment reference
> counter via AddRef()? I cannot understand that ...)

I'm not in the office right now, and therefore cannot check in the COM
specs, but I seem to remember this is the required behavior:
pack_interface() keeps a pointer to the COM object in an OCaml object,
resulting in one more reference; unpack_interface() returns a
reference to the C/C++ caller, which also results in one more
reference.

- Xavier Leroy

(0000054)
administrator (administrator)
2001-07-16 11:53

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> > Of course moving caml object in memory by GC is very correct (I was
> > wrong), and is traced via register_global_root(), but nonetheless
> > the bug exists there. I've managed to isolate it myself. comintf.c
> > from camlidl's runtime contains he following code:
> > The problem is that camlidl_finalize_interface() may be invoked when
> > GC moves object in memory, calling i->lpVtbl->Release(i) (in fact
> > camlidl_Release()) and decrementing reference counter. It becomes
> > equal to zero and the memory, occupied by the interface, is
> > incorrectly reclaimed via stat_free(comp). That's the bug.
>
> I don't think it's a bug. The GC calls camlidl_finalize_interface()
> when the corresponding Caml object becomes unreferenced from Caml.
> Then, it decreases the refcount to signal that Caml no longer holds a
> reference to the COM object. If the refcount becomes zero, this means
> that nobody (including your C++ code) is still holding a reference to
> the object. Hence, deallocating it is correct. Isn't it?
>
> In other terms, if your C++ code still accesses the COM object after
> its refcount is 0, then probably your code forgot to increment the
> refcount as per the COM spec.

Well, it just means that I did not understand some GC basics (now I hope I
am closer to that) and the bug is somethere else. My code is something
like that:

Caml DLL side:
[---cut---]
...
let factory () =
  let obj = new camlClass in
  let iIntf1 = make_iIntf1 obj
  and iIntf2 = make_iIntf2 obj in
    Com.combine iIntf1 iIntf2

let _ =
  Com.register_factory
    { Com.create = factory;
      ... }
[---cut---]

C++ side:
[---cut---]
testComponent(const CLSID & clsid)
{
  IIntf1 *pIIntf1 = NULL;
  HRESULT hr = ::CoCreateInstance(clsid,
                                    NULL,
                                    CLSCTX_INPROC_SERVER,
                                    IID_IIntf1,
                                    (void**)&pIIntf1) ;
  CHECK_RES;
  hr = pIPlatePatterns->method1();
  CHECK_RES;
  hr = pIPlatePatterns->method2();
  ...
[---cut---]

caml object gets finalized inside method1() and method2() just crashes.
 
> In other terms, if your C++ code still accesses the COM object after
> its refcount is 0, then probably your code forgot to increment the
> refcount as per the COM spec.

As you see I don't play with refcount myself.

> > (BTW, why *both*
> > camlidl_pack_interface() and camlidl_unpack_interface() increment reference
> > counter via AddRef()? I cannot understand that ...)
>
> I'm not in the office right now, and therefore cannot check in the COM
> specs, but I seem to remember this is the required behavior:
> pack_interface() keeps a pointer to the COM object in an OCaml object,
> resulting in one more reference; unpack_interface() returns a
> reference to the C/C++ caller, which also results in one more
> reference.

I think now I know (finally :-)) where is the bug.

[runtime/cfactory.cpp]
  virtual HRESULT __stdcall CreateInstance(IUnknown* pUnknownOuter,
                                           const IID& iid,
                                           void** object)
  {
    // Aggregation is not supported yet
    if (pUnknownOuter != NULL) return CLASS_E_NOAGGREGATION;
    // Create the component
    value vcomp =
      callback(Field(this->comp->compdata, COMPDATA_CREATE), Val_unit);
    IUnknown * comp = (IUnknown *) camlidl_unpack_interface(vcomp, NULL);
        //
        // My comment:
        // Wrong! camlidl_unpack_interface(vcomp, NULL) does not increment
        // refcount! You should use
        // struct camlidl_ctx_struct ctx = { CAMLIDL_ADDREF, NULL };
        // IUnknown * comp = (IUnknown *) camlidl_unpack_interface(vcomp, &ctx);
        // or remove comp->Release() below.
        //
    // Get the requested interface
    HRESULT res = comp->QueryInterface(iid, object);
    // Release the initial pointer to the component
    // (if QueryInterface failed, it will destroy itself)
    comp->Release();
    // Return result of QueryInterface
    return res;
  }
[end of runtime/cfactory.cpp]

Hope to hear from you soon,
Dmitry


(0000055)
administrator (administrator)
2001-07-16 18:50

Dmitry Bely <dbely@mail.ru> writes:

> I think now I know (finally :-)) where is the bug.
>
> [runtime/cfactory.cpp]

Oh. Seems I was wrong again. I definitely should not bother you with my
suppositions (crying that I know what should be fixed), but this bug is the
*real* showstopper for me.

The problem is possibly why caml tries to GC an object that is still
not released by the client. My next supposition is that
camlidl_com_combine() does not lock the combined object via
register_global_root() (only its parts are locked), and so GC just removes
caml value, representing the combined object. What do you think of this?

Hope to hear from you soon,
Dmitry


(0000056)
administrator (administrator)
2001-07-16 21:25

> Oh. Seems I was wrong again. I definitely should not bother you with my
> suppositions (crying that I know what should be fixed),

No problem. It seems like refcount handling is broken somewhere in
the CamlIDL runtime code, and I will look into this at some point next
week (I'm at a conference this week). Any bug-tracking work that you
could do in the meantime will be very useful! Actually, if you have a
self-contained program that reproduces the crash, I would be
interested in getting a copy.

Best wishes,

- Xavier Leroy

(0000057)
administrator (administrator)
2001-07-17 12:12

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> It seems like refcount handling is broken somewhere in
> the CamlIDL runtime code,

I think no (although as usual I may be wrong :-). The bug is in the runtime
library but it has nothing to do with refcounting. IMHO the problem happens
when we use Com.combine:

value camlidl_com_combine(value vintf1, value vintf2)
{
  struct camlidl_intf * i1, * i2;
  struct camlidl_component * c1, * c2, * c;
  int n, i;

  i1 = camlidl_unpack_interface(vintf1, NULL);
  i2 = camlidl_unpack_interface(vintf2, NULL);
  if (! is_a_caml_interface(i1) || ! is_a_caml_interface(i2))
    camlidl_error(CLASS_E_NOAGGREGATION, "Com.combine",
                  "Not a Caml interface");
  c1 = i1->comp;
  c2 = i2->comp;
  n = c1->numintfs + c2->numintfs;
  c = (struct camlidl_component *)
        stat_alloc(sizeof(struct camlidl_component) +
                   sizeof(struct camlidl_intf) * (n - 1));
  InterlockedIncrement(&camlidl_num_components);
  c->numintfs = n;
  c->refcount = 1;
  for (i = 0; i < c1->numintfs; i++)
    c->intf[i] = c1->intf[i];
  for (i = 0; i < c2->numintfs; i++)
    c->intf[c1->numintfs + i] = c2->intf[i];
  for (i = 0; i < n; i++) {
    register_global_root(&(c->intf[i].caml_object));
    c->intf[i].comp = c;
  }
  return camlidl_pack_interface(c->intf + (i1 - c1->intf), NULL);
}

value camlidl_pack_interface(void * intf, camlidl_ctx ctx)
{
  value res = alloc_final(2, camlidl_finalize_interface, 0, 1);
  Field(res, 1) = (value) intf;
  if (ctx != NULL && (ctx->flags & CAMLIDL_ADDREF)) {
    struct IUnknown * i = (struct IUnknown *) intf;
    i->lpVtbl->AddRef(i);
  }
  return res;
}

camlidl_com_combine() returns new interface value, but GC does not know
this value is used elsewhere (no register_global_root() is set for it) and
finalizes it when it needs more memory.

> and I will look into this at some point next
> week (I'm at a conference this week). Any bug-tracking work that you
> could do in the meantime will be very useful! Actually, if you have a
> self-contained program that reproduces the crash, I would be
> interested in getting a copy.

No problem. I've just modified your test example a bit:

[---cut---]
Index: camlcomp.ml
===================================================================
RCS file: /caml/bazar-ocaml/camlidl/tests/comp/camlcomp.ml,v
retrieving revision 1.1
diff -u -w -r1.1 camlcomp.ml
--- camlcomp.ml 1999/02/22 17:42:23 1.1
+++ camlcomp.ml 2001/07/17 06:37:33
@@ -3,7 +3,9 @@
 class mycomponent =
   object
     method fx =
- print_string "Camlcomp: fx"; print_newline()
+ Array.create 1000000 0; print_string "Camlcomp: fx"; print_newline()
+ method fx2 =
+ Array.create 1000000 0; print_string "Camlcomp: fx2"; print_newline()
     method fy n =
       print_string "Camlcomp: fy "; print_int n; print_newline()
     method fz n =
@@ -20,6 +22,7 @@
   Com.combine (Com.combine ix iy) iz
 
 let _ =
+ Gc.set { (Gc.get()) with Gc.space_overhead = 1; Gc.max_overhead = 0; Gc.verbose = 0xff };
   Com.register_factory
     { Com.create = factory;
       Com.clsid = Com.clsid "aab56090-c721-11d2-8e2b-0060974fbf19";
Index: CLIENT.CPP
===================================================================
RCS file: /caml/bazar-ocaml/camlidl/tests/comp/CLIENT.CPP,v
retrieving revision 1.1
diff -u -w -r1.1 CLIENT.CPP
--- CLIENT.CPP 1999/02/22 17:42:23 1.1
+++ CLIENT.CPP 2001/07/17 06:37:33
@@ -24,6 +24,7 @@
     {
         trace("Succeeded getting IX.") ;
         pIX->Fx() ; // Use interface IX.
+ pIX->Fx2() ; // Use interface IX.
 
         trace("Ask for interface IY.") ;
         IY* pIY = NULL ;
Index: CMPNT.CPP
===================================================================
RCS file: /caml/bazar-ocaml/camlidl/tests/comp/CMPNT.CPP,v
retrieving revision 1.1
diff -u -w -r1.1 CMPNT.CPP
--- CMPNT.CPP 1999/02/22 17:42:23 1.1
+++ CMPNT.CPP 2001/07/17 06:37:34
@@ -44,6 +44,7 @@
 
     // Interface IX
     virtual HRESULT __stdcall Fx() { cout << "Fx" << endl ; return S_OK; }
+ virtual HRESULT __stdcall Fx2() { cout << "Fx2" << endl ; return S_OK; }
 
     // Interface IY
     virtual HRESULT __stdcall Fy(int x)
Index: component.idl
===================================================================
RCS file: /caml/bazar-ocaml/camlidl/tests/comp/component.idl,v
retrieving revision 1.1
diff -u -w -r1.1 component.idl
--- component.idl 1999/02/22 17:42:23 1.1
+++ component.idl 2001/07/17 06:37:34
@@ -1,6 +1,7 @@
 [object, uuid(32bb8320-b41b-11cf-a6bb-0080c7b2d682)]
 interface IX : IUnknown {
   HRESULT Fx();
+ HRESULT Fx2();
 }
 
 [object, uuid(32bb8321-b41b-11cf-a6bb-0080c7b2d682)]
Index: IFACE.H
===================================================================
RCS file: /caml/bazar-ocaml/camlidl/tests/comp/IFACE.H,v
retrieving revision 1.1
diff -u -w -r1.1 IFACE.H
--- IFACE.H 1999/02/22 17:42:23 1.1
+++ IFACE.H 2001/07/17 06:37:34
@@ -6,6 +6,7 @@
 interface IX : IUnknown
 {
     virtual HRESULT pascal Fx() = 0 ;
+ virtual HRESULT pascal Fx2() = 0 ;
 };
[---cut---]

It crashes when fx2() method is called.

Hope to hear from you soon,
Dmitry


(0000058)
administrator (administrator)
2001-07-30 16:01

I finally got time to look at the CamlIDL problem you mentioned.
Thanks for the sample code; with the help of generous insertion of
printf() statements, it does confirm your analysis:

> I think now I know (finally :-)) where is the bug.
>
> [runtime/cfactory.cpp]
> virtual HRESULT __stdcall CreateInstance(IUnknown* pUnknownOuter,
> const IID& iid,
> void** object)
> {
> // Aggregation is not supported yet
> if (pUnknownOuter != NULL) return CLASS_E_NOAGGREGATION;
> // Create the component
> value vcomp =
> callback(Field(this->comp->compdata, COMPDATA_CREATE), Val_unit);
> IUnknown * comp = (IUnknown *) camlidl_unpack_interface(vcomp, NULL);
> //
> // My comment:
> // Wrong! camlidl_unpack_interface(vcomp, NULL) does not increment
> // refcount! You should use
> // struct camlidl_ctx_struct ctx = { CAMLIDL_ADDREF, NULL };
> // IUnknown * comp = (IUnknown *) camlidl_unpack_interface(vcomp, &ctx);
> // or remove comp->Release() below.
> //
> // Get the requested interface
> HRESULT res = comp->QueryInterface(iid, object);
> // Release the initial pointer to the component
> // (if QueryInterface failed, it will destroy itself)
> comp->Release();
> // Return result of QueryInterface
> return res;
> }

You are very correct that the refcount must be incremented, otherwise
there are really two references to the interface "comp" when this
function returns: one is from a Caml object in the heap, which is
unreachable and will soon be garbage collected, and another one is the
one we return via the "*object" out parameter.

I will fix this as suggested in your comment.

Apart from this, I looked at the implementation of Com.combine more
carefully and didn't see anything weird. (Famous last words!)

Thanks again for your solid testing of Camlidl.

- Xavier Leroy

(0000059)
administrator (administrator)
2001-07-30 16:26

Fixed by XL 2001-07-30

- Issue History
Date Modified Username Field Change
2005-11-18 10:13 administrator New Issue


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker