NSX-T 3.2 GA + NTA = NSX Intelligence DOWN

It’s bug time again ….

If you're setting up NTA (Network Traffic Analysis) on your shiny new NSX-T 3.2 deployment, you may have hit a situation where everything goes “pete tong”, as us Brits like to say, as soon as you enable the NTA detectors. It comes down to two specific detectors using the wrong docker registry:

  1. Horizontal Port Scanner
  2. Non Standard Port Usage

There is a fix, but it requires you to edit the deployment configuration for the NTA server. My advice is to update the deployment before you enable these detectors. If you have already enabled them, make the change below and then delete the affected pods; they should get redeployed.

First, SSH to your Kubernetes Control Plane Node, then complete the following:

Edit the NTA Server Deployment

kubectl -n nsxi-platform edit deployment nta-server

Look for the env variable called NTAFLOW_IMAGE:

---------
      containers:
      - env:
        - name: SPRING_CONFIG_LOCATION
          value: /opt/vmware/pace/config/application.yaml
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: SPARK_HOME
          value: /opt/spark
        - name: SERVICE_ACCOUNT_NAME
          value: nta-server-sa
        - name: NTAFLOW_IMAGE
          value: harbor-repo.vmware.com/nsx_intelligence/clustering/nta-flow:19238496
------------------

Replace harbor-repo.vmware.com with the hostname of your own docker registry, leaving the rest of the image path and the tag unchanged. For example, if my registry were livefire-labs.dev, the entry would look like:

        - name: NTAFLOW_IMAGE
          value: livefire-labs.dev/nsx_intelligence/clustering/nta-flow:19238496

Finally, save and exit.
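If you'd rather not hand-edit the deployment, the same change can probably be made non-interactively. This is only a minimal sketch, not something from the fix itself: the function names are hypothetical helpers of mine, livefire-labs.dev is just the example registry from above, and it assumes `kubectl set env` is an acceptable way to change the variable in your environment.

```shell
#!/bin/sh
# Hypothetical helper: replace the registry host (everything before the
# first "/") in an image reference, keeping the repository path and tag.
swap_registry() {
  image="$1"
  new_registry="$2"
  printf '%s/%s\n' "$new_registry" "${image#*/}"
}

# Read the current NTAFLOW_IMAGE value from the nta-server deployment.
current_ntaflow_image() {
  kubectl -n nsxi-platform get deployment nta-server \
    -o jsonpath='{.spec.template.spec.containers[*].env[?(@.name=="NTAFLOW_IMAGE")].value}'
}

# Point NTAFLOW_IMAGE at the given registry without opening an editor.
# Note: "kubectl set env" touches every container in the pod template
# unless you restrict it with -c <container-name>.
patch_ntaflow_image() {
  current="$(current_ntaflow_image)"
  if [ -z "$current" ]; then
    echo "NTAFLOW_IMAGE not found on deployment nta-server" >&2
    return 1
  fi
  new_image="$(swap_registry "$current" "$1")"
  kubectl -n nsxi-platform set env deployment/nta-server "NTAFLOW_IMAGE=$new_image"
}
```

With these functions loaded in your shell, `patch_ntaflow_image livefire-labs.dev` would rewrite the variable. It's worth running `current_ntaflow_image` first to confirm what is set before changing anything.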

At this point you should be able to enable the detectors and everything should work. Like I said, if you had already enabled the detectors before making the change, just delete the existing detector pods and they should redeploy automatically with the corrected image.
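For the clean-up, here's a rough sketch of how you might find and delete the broken pods. I'm assuming the affected pods are the ones stuck on an image pull error, which is what a wrong registry would typically produce; I haven't listed the actual pod names, so check the output before deleting anything. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "kubectl get pods --no-headers" output on stdin
# and print the names of pods whose STATUS column ends in an image pull error
# (this also catches init-container variants such as Init:ErrImagePull).
image_pull_failures() {
  awk '$3 ~ /(ImagePullBackOff|ErrImagePull)$/ { print $1 }'
}

# Delete each pod with an image pull failure so its controller recreates it.
redeploy_failed_pods() {
  kubectl -n nsxi-platform get pods --no-headers | image_pull_failures |
    while read -r pod; do
      kubectl -n nsxi-platform delete pod "$pod"
    done
}
```

Running `kubectl -n nsxi-platform get pods --no-headers | image_pull_failures` on its own gives you a dry run of what `redeploy_failed_pods` would delete.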

This WILL be fixed in an upcoming release, but I know a few contacts are eager to play with the NTA + NDR features of NSX-T 3.2, so hopefully this will reduce some head scratching in the meantime.

FINAL HINT:

If you are running an environment with NSX Intelligence in evaluation mode, you might need to increase the vCPU count of the worker node(s) in your cluster from 16 vCPU to 24 vCPU. This was required in our environment because we were running multiple NAPP services and the anomaly detection pod was failing to run due to a CPU constraint.
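If you want to check whether you're in the same boat before resizing anything, something like the sketch below may help. It assumes the CPU constraint shows up as a scheduling failure with an "Insufficient cpu" message in the namespace events, which is how the Kubernetes scheduler normally reports it; your symptoms may differ, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read event text on stdin and succeed if any line
# reports a CPU shortfall.
has_cpu_shortfall() {
  grep -qi 'insufficient cpu'
}

# List pods that have not been scheduled or started yet.
pending_pods() {
  kubectl -n nsxi-platform get pods --field-selector=status.phase=Pending
}

# Report whether recent scheduling failures mention a lack of CPU.
check_cpu_pressure() {
  if kubectl -n nsxi-platform get events --field-selector=reason=FailedScheduling |
      has_cpu_shortfall; then
    echo "Scheduler reports insufficient CPU; consider adding vCPUs to the worker node(s)."
  else
    echo "No CPU-related scheduling failures found in recent events."
  fi
}
```

Events age out of the cluster fairly quickly, so `check_cpu_pressure` only tells you about recent failures; `pending_pods` shows what is still waiting right now.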

Hope that helps you all.

Bal Birdy
