For this reason, it helps to understand how NARs are loaded in NiFi. When NiFi loads a NAR, it creates a NarClassLoader for it. A NarClassLoader is a URLClassLoader that contains all the JAR dependencies needed by that NAR, such as third-party libraries, NiFi utilities, etc. If the NAR definition includes a parent NAR, then the NarClassLoader's parent is the NarClassLoader for the parent NAR. This gives all NARs with the same parent access to the same classes, which alleviates certain classloader issues when NARs and utilities communicate with each other. One pervasive example is the specification of an "API NAR" such as "nifi-standard-services-api-nar", which enables the child NARs to use the same API classes/interfaces.
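As a rough illustration of the parent relationship described above, here is a minimal Java sketch. It is not NiFi code: the class name SharedParentSketch and the helper newLoader are made up for this example, and the loaders are given empty URL lists instead of real NAR contents. It only shows that two sibling URLClassLoaders that delegate to one parent resolve a class to the same Class object.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

class SharedParentSketch {

    // Hypothetical stand-in for a NAR's loader. A real NarClassLoader would
    // be given the URLs of the JARs bundled in the NAR; here the list is empty.
    static URLClassLoader newLoader(ClassLoader parent) {
        return new URLClassLoader(new URL[0], parent);
    }

    // Loads className through two sibling loaders that share one parent and
    // reports whether both resolved to the very same Class object.
    static boolean siblingsShareClass(String className) {
        try (URLClassLoader apiLoader = newLoader(ClassLoader.getSystemClassLoader());
             URLClassLoader childA = newLoader(apiLoader);
             URLClassLoader childB = newLoader(apiLoader)) {
            Class<?> viaA = Class.forName(className, false, childA);
            Class<?> viaB = Class.forName(className, false, childB);
            return viaA == viaB;
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints "true": both children delegate the lookup up the same chain
        System.out.println(siblingsShareClass("java.util.ArrayList"));
    }
}
```

If each child instead carried its own copy of a JAR and no shared parent provided the class, the two Class objects would differ, which is the kind of mismatch a shared API NAR avoids.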
All NARs (and all child ClassLoaders in Java) have the following class loaders in their parent chain (listed from top to bottom):
- Bootstrap class loader
- Extensions class loader
- System class loader
You can consult the Wiki page for Java ClassLoader for more information on these class loaders, but in the NiFi context just know that the System class loader (aka Application ClassLoader) includes all the JARs from the lib/ folder (but not the lib/bootstrap folder) under the NiFi distribution directory.
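To see this chain outside of NiFi, a small Java sketch along the following lines can walk the parents of the system class loader. The class name LoaderChainSketch and the method chainTopDown are made up for this example. Two caveats: the bootstrap class loader is represented by null in Java, so it never shows up as an object, and on Java 9 and later the extensions class loader is replaced by the platform class loader and the system class loader is no longer a URLClassLoader, so the sketch reads its entries from the java.class.path property instead.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LoaderChainSketch {

    // Walks parent links upward from the given loader and returns the chain
    // ordered top to bottom. The bootstrap loader is null in Java, so it
    // never appears in the list.
    static List<ClassLoader> chainTopDown(ClassLoader loader) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader c = loader; c != null; c = c.getParent()) {
            chain.add(c);
        }
        Collections.reverse(chain);
        return chain;
    }

    public static void main(String[] args) {
        // Print each loader in the system class loader's chain, top first.
        for (ClassLoader c : chainTopDown(ClassLoader.getSystemClassLoader())) {
            System.out.println(c);
        }
        // List the classpath entries the JVM was launched with, one per line.
        String classPath = System.getProperty("java.class.path", "");
        for (String entry : classPath.split(File.pathSeparator)) {
            System.out.println("\t" + entry);
        }
    }
}
```

Run inside a NiFi JVM, the classpath entries would be expected to correspond to the JARs in lib/, assuming NiFi was started with those JARs on its classpath as described above.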
To help in debugging classloader issues, either on a standalone node or a cluster, I wrote a simple flow using ExecuteScript with Groovy to send out a flow file per NAR, whose contents include the classloader chain (including which JARs belong to which URLClassLoader) in the form:

<classloader_object>
	<path_to_jar_file>
	<path_to_jar_file>
	<path_to_jar_file>
	...
<classloader_object>
	<path_to_jar_file>
	<path_to_jar_file>
	<path_to_jar_file>
	...

The classloaders are listed from top to bottom, so the first will always be the extensions classloader, followed by the system classloader, etc. The NarClassLoader for the given NAR will be at the bottom.
The script is as follows:

import java.net.URLClassLoader
import org.apache.nifi.nar.NarClassLoaders

// One flow file per NAR classloader
NarClassLoaders.instance.extensionClassLoaders.each { c ->
    // Collect the chain from the NAR's classloader up to the top-most parent
    def chain = []
    while (c) {
        chain << c
        c = c.parent
    }
    def flowFile = session.create()
    flowFile = session.write(flowFile, { outputStream ->
        // Write the chain top to bottom, with each URLClassLoader's JARs indented beneath it
        chain.reverseEach { cl ->
            outputStream.write("${cl.toString()}\n".bytes)
            if (cl instanceof URLClassLoader) {
                cl.getURLs().each {
                    outputStream.write("\t${it.toString()}\n".bytes)
                }
            }
        }
    } as OutputStreamCallback)
    session.transfer(flowFile, REL_SUCCESS)
}

This can be used in a NiFi flow, perhaps using LogAttribute or PutFile to display the results of each NAR's classloader hierarchy.
Note that these are the classloaders that correspond to a NAR, not the classloaders that belong to instances of processors packaged in the NAR. I will tackle runtime information about the classloader chain associated with a processor instance in another blog post :)
Please let me know if you find this useful. As always, suggestions, questions, and improvements are welcome. Cheers!